Thursday, November 02, 2006

Why we use Fortran and Python

From Mark Chu-Carroll, a post on why C/C++ aren't always fastest, which is of course well known in (large parts of) the scientific computing community: Fortran compilers can perform much better optimisation than C compilers, because Fortran has true arrays and loop constructs, as opposed to C's sugar-coated assembler. C is a great language in which to develop an operating system or a device driver, but not for writing computationally intensive code that could benefit from parallelisation, where Fortran beats it easily.

And what about C++? The object-oriented features of C++ are nice for scientific applications, sure; you can have matrix, vector and spinor types with overloaded operators and such. But Fortran 95 has those things too, without suffering from the problems that C++'s C heritage brings. Fortran 95 also has even nicer features, such as elemental functions (functions written for scalars that can be applied element by element to whole arrays); that's something that no C-based language can give you because of C's poor support for arrays. And in case there is some object-oriented feature that you feel is missing in Fortran 95, just wait for Fortran 2003, which includes those as well.

But what about developing graphical user interfaces? Fortran doesn't have good support for those, now, does it? No, it doesn't, but that's beside the point; Fortran ("FORmula TRANslation") is meant as a number-crunching language. I wouldn't want to write a graphical user interface for my code in either Fortran or C/C++. For these kinds of tasks, I use Python, because it is the most convenient language for them; the user interface is not computationally intensive, so speed isn't crucial, and for the number crunching, the front end calls the fast Fortran program, getting you the best of both worlds -- and without using any C anywhere (other than the fact that the Python interpreter, and probably the Fortran compiler, were written in C, which is the right language for those kinds of tasks).

Most lattice people I have personally worked with use Fortran, and a few use Python for non-numerical tasks. Of course there are large groups that use C or C++ exclusively, and depending on what they are doing, it may make sense, especially if there is legacy C or assembly code that needs to be linked with. But by and large, computational physicists are Fortran users -- not because they are boring old guys, but because they are too smart to fall for the traps that the cool C++ kids run into. (Oh, and did I mention that Fortran 95 code is a lot easier to read and debug than C++ code? I have debugged both, and the difference is something like an hour versus a day to find a well-hidden off-by-one bug.)


Samuel Rocha de Oliveira said...

Good tip on using Python for interface. Thanks.

Mike Creutz said...

Ahh, flamebait! It is always possible to write bad code in any language, but C++ makes it easy. I like basic myself; indeed perl is just basic for snobs. Nowadays I often use bc for quick things.

Robert said...

I am not using numerical computations professionally myself so this comment is from a pure amateurish perspective:

My impression is that the main reason to use FORTRAN is because of legacy code and libraries. And that's a perfectly good reason.

The second reason about better compilers I think is much weaker: It's true, for true standard C the compiler is often lost when it comes to important optimisations. For example, strictly speaking, if anywhere in your program (even only included) there is a statement such as *p=1.0; then from that point on the compiler must not assume it knows anything about any variable's contents: it just has no way to figure out which address the pointer p points to, or under which other variable names this memory cell is known as well. Therefore all optimisations that would use information about which variable contains what (for example by reusing previously computed values or rearranging pieces of code that do not influence each other) must not be performed anymore, because they could result in behaviour different from what is expected.
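To make the aliasing concern concrete, here is a minimal sketch in C. The function name add_scalar is made up for this illustration and does not come from any library; the point is only that two pointers of the same type may refer to overlapping memory, so the compiler cannot safely load *scale once before the loop.

```c
#include <stddef.h>

/* Hypothetical helper for illustration: adds *scale to each of the n
 * elements of dst. Since dst and scale are both pointers to double, the
 * compiler has to allow for scale pointing into dst, so it must re-read
 * *scale after every store instead of keeping it in a register. */
void add_scalar(double *dst, const double *scale, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += *scale;
}
```

Called with a separate scalar, this adds the same value to every element. Called as add_scalar(b, &b[0], 3) on the array {1, 2, 3}, the first store changes the value being added, and the result is {2, 4, 5} rather than {2, 3, 4}. A compiler that hoisted the load of *scale out of the loop would silently change that behaviour, which is why it is not allowed to do so without extra information.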

BUT: Most of the time, these are theoretical concerns and the really bad things don't happen in real-world programs. That's what compiler switches were invented for. You can tell your compiler that you are not trying to trip it up, so that it can assume certain things it strictly should not assume. For example, you can give your compiler a hint that modifying the memory pointed to by a char * pointer will never change the contents of a double variable.

This and much much more is for example discussed in the gcc man page.

What I wanted to say is that with some hints good C compilers can do very good optimisations. But still it is much harder for a C compiler than for a FORTRAN compiler to figure out that some code actually performs a matrix times vector operation, which on some machines can be translated into a single CPU command.

But, let's face it: FORTRAN code is ugly as hell and its syntax design is just painful (let me just mention the default variable types based on names beginning with certain letters, OK, implicit none turns that off, or common areas and the seventh column). It is much better in more recent versions than FORTRAN77, but the few occasions where I had to use FORTRAN in my life always nearly made me cry.

Of course I have also spent many hours in my life searching for hidden obiwan errors.

And, as everybody knows, an experienced FORTRAN programmer can write FORTRAN programs in any language. For a nice essay see here (5 pages).

Georg said...

You're right that with the right user input in the form of compiler switches and #pragma's, a good C compiler can probably create code that is similar in efficiency to a similar Fortran code. But with Fortran, the user doesn't have to think about these issues at all; the restrictions inherent in the language definition do that for him. The C user may find himself in the situation of telling the compiler to make an assumption which is actually false (the code has changed, but some optimisation flag is still in the Makefile), which it will happily do, creating code which gives wrong results or crashes.

As to prettiness, well, "pretty" and "ugly" are subjective notions; I personally don't find C++ code particularly pretty. Fortran syntax for mathematical expressions is much closer to normal mathematical notation than C, and the use of english words in many places where C has some special character makes the code more readable IMHO ("end do" and "end if" are clearer than "}" and "}"). Modern Fortran (the all-caps "FORTRAN" is no longer used for Fortran 90/95/2003) programs don't use fixed-form source code anymore, so columns 1-7 are just as ordinary as any other column, and common blocks have been superseded by modules. Implicit typing is probably the single worst feature of Fortran, but as you said it is easily turned off.

Of course a lot of legacy code is written in a rather dreadful FORTRAN77 style, which I admit is ugly. But modern Fortran 90/95 programs don't look like that, really.

Anonymous said...

Hmm, I am surprised to hear that F95 has changed the tide. I have suffered f77 "compilers" that do nothing but covertly translate to C and compile it with gcc. Thus slower programs in Fortran than in C, against your naive expectations.

Georg said...

Anonymous, I suppose you are referring to the f2c "compiler". I don't really know why anybody would use that, as virtually every high-end chipset vendor provides a Fortran compiler that is heavily tuned to their specific architecture. And it isn't as if those compilers were hard to find: The Intel Fortran compiler is even available for free for non-commercial academic use.

Anonymous said...

Not only f2c. For instance, a commercial f77 "compiler" for NeXT happened to have response times "similar" to f2c.

Anonymous said...

Quite a few Fortran compilers use ANSI C as an intermediate "opcode" (e.g. NAG), particularly on Unix. This does not prevent the Fortran compiler from optimizing the code!

Two things to consider:

* The C is compiled and optimized Fortran, not hand-typed C.

* The C compiler can be instructed to assume no pointer aliasing, and the Fortran compiler makes sure this happens. How to do this depends on the C compiler. It is either compiler pragmas, command line flags or the non-standard __restrict__ qualifier e.g. found in GNU C.
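As a rough sketch of the last point: besides the GNU-specific __restrict__ spelling, C99 standardised a restrict qualifier that expresses the same promise. The function below is a made-up example, not output from any actual Fortran-to-C translator; it only shows where such a qualifier would go.

```c
#include <stddef.h>

/* Illustrative only: computes y[i] += a * x[i] for i = 0 .. n-1.
 * The restrict qualifiers promise that x and y do not overlap, which is
 * what Fortran guarantees for distinct dummy array arguments. With that
 * promise the compiler may reorder or vectorise the loads and stores. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

The promise is not checked: passing overlapping arrays to such a function is undefined behaviour, so a translator (or a programmer) has to be sure that it actually holds.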

Naum said...

People, let's face the facts: a year ago I actually hated Fortran. To me the code looked too big, impractical, a bit ugly, and the biggest minus was that I couldn't use all the blessings of object-based programming etc. THEN I started my first-year astronomy course, where I was forced to write all of my programs in C++ and FTN. Eventually I realized that FTN handles intensive computing tasks very well; at the end of the day, for calculating reallllly big numbers I don't need the fancy interface of win or linux. All I need is a program that works fast.
p.s. The really strict syntax of FTN helps a lot when it comes to finding mistakes and writing without mistakes.

Anonymous said...

well, i am a physicist and when i do simulations i use c++. i do not think fortran is any better. i always use optimized (acml/mkl/atlas) routines for all numerically intensive tasks. i agree, however, that for instance complex*complex operations take a _lot_ of computation time. i believe this is not the case with fortran, which already contains a native 'complex' type.
bottom line: i am considering partly switching to fortran (9?) to see how it goes.

Praveen said...

Fortran is horribly ugly. I think I can never get used to its syntax, particularly the lack of parentheses. In C you can check in emacs for parenthesis completion. I've got around this with preprocessor definitions.

However, the arrays are easier to manipulate in Fortran. It has been very convenient to slice off data while creating initial conditions for new runs. But these are just cosmetic arguments.

Thasha said...

One common use of FORTRAN’s Equivalence is the following: A large array of
numeric values is made available to a subprogram as a parameter. The array
contains many different unrelated variables, rather than a collection of repetitions
of the same variable. It is represented as an array to reduce the number of names
that need to be passed as parameters. Within the subprogram, a lengthy
Equivalence statement is used to create connotative names as aliases to the
various array elements, which increases the readability of the code of the
subprogram. Is this a good idea or not? What alternatives to aliasing are

can someone help me with this.. i got tis question from my textbook and i need the answer for revision..
thank u...

Anonymous said...

hey can help me with the quest wic i gave u previously?

Georg said...

Dear Thasha,
this is a lattice QCD blog, not a CS homework help blog. Still, in general any use of the EQUIVALENCE statement is almost as good an idea as asking questions in this fashion, i.e. not so very good.

Thasha said...

sorry george..anyway thanx for ur kind reply..:)

Anonymous said...

You may look at winteracter for a fortran guibuilder.

Keith Miller said...

C'mon, FORTRAN is terrible. 14 years ago I was paid to convert engineering programs out of FORTRAN. Now I make my living as a scientist and engineer and all *new* computational analysis code is written in C or C++. Unfortunately we have some FORTRAN code too (due to acquisitions) and it's the scariest ugliest code I have had the misfortune of encountering. If you ask anyone I work with you'll get the answer that if we could wave a magic wand and turn all our FORTRAN code into C++ we would, but this stuff is just too ugly to contemplate touching. FORTRAN keeps its place in academia since old teachers used it, they teach it to students who themselves become teachers. We need to break the FORTRAN academic cycle and get some real world experience into that loop! :-)

Anonymous said...

If your primary niche in life is to do computations or analysis, then Fortran is a language that can be easily learned and offers efficiency. I see no reason why anyone would try learning C or especially C++ when they have no intention of programming outside of computations. It takes precious time to develop proficiency in C++. Using Python for the GUI seems like a great idea. Anyone else have experience mixing Python and Fortran?

rayo said...

Try to deal with networks/graphs in "fortcrap", it is a nightmare.

Anonymous said...

You have to love the statement "just wait for Fortran 2003" given that this article was dated 2006. Not to mention that it is now nearing 2013 and the major Fortran compiler vendors have yet to fully implement the 2008 standard let alone the 2003 standard. That should be enough, but add to it the poor job that the Fortran standards committee is doing in playing catchup to established Object Oriented languages. Perhaps it would help if someone on the committee actually understood and experienced Object Oriented design. The evidence is overwhelming as so many in the computational sciences arena are resorting to implementing Object Oriented designs via Very High Level Languages such as Python and Ruby. Give me a break!

Anonymous said...

Fortran is great if you do scientific computation; it is more readable than C.
And this is crucial, because in most cases you modify code written by others.
Fortran is not ugly; problems come with goto or undeclared variables, but of course, if you do not know the basics of programming... use mathematica!