… or Problems of C Programming Language
In my university, students are taught C as their first programming language - not only computer engineering students, but also students of some other engineering programs. Although I have always liked writing code in C, there are reasons why I think it is not a proper language for teaching the concept of programming, or even for writing most programs.
According to Wikipedia, C is a general-purpose, procedural, imperative computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system [5]. Sure, it is procedural and imperative. But as for the purpose of the language: although we must categorize C as general-purpose, who can argue that C is the right choice for implementing every problem? C was originally designed to be a portable assembly language for easier implementation of UNIX [2]. Thus, it is a low-level programming language to use when hardware-oriented or performance-critical software is needed.
In this article I will mention some problems, or pseudo-problems, of the C programming language. A much broader criticism of C can be found at [6].
Readability
We do not need to say a lot about C and readability: C is not readable at all - don’t forget it is a kind of enhancement to assembly language. Let us look at an example (taken from [1]). Have a look at code listings 1 and 2. The first is an implementation of a list of integers in C and the second is the Python equivalent of the same code. We are not interested in how readable the Python code is, but in how hard the C code is to read.
While writing code in C, you have to help the compiler a lot. For the compiler’s sake you put braces ({ and }) around blocks. Stars here and stars there confuse the programmer (and, later, the reader of the code). When you want to use some memory you have to allocate it and make sure all pointers are initially set to NULL. When you want to update the list, you must take care of all the pointers again. The code becomes a mess full of statements that are not related to the real problem, and anyone who wants to read and understand it finds (her/him)self inside this mess. (S)he has to fight with all these stars, braces, allocation statements and so on.
Have a look at the table under the Expressiveness section of [4]. According to this table, you have to write 6 lines of C code to do the same job as a single line of Python or Perl code. Also, the number of lines needed to implement a problem in C is 2.5 times the number needed to implement the same problem in Java or Fortran. Thus, we can say C is not an efficient language for either writing or reading code.
Orthogonality
Have a look at the two function definitions in code listing 3 (example taken from [1]). The first function (double_int) takes an integer parameter, doubles its value, stores the result in a local variable and returns that value. The second function (double_str) takes a character pointer as a parameter and doubles its value into a local variable using some library functions. Like the first function, it finally returns the local variable. But the second function cannot work as written, because it tries to return a pointer to a local variable whose storage disappears when the function returns.
The programmer often does not get the chance to write similar code for two similar instances of the same problem in C. For this reason, C is said not to be orthogonal.
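Code listing 3 is only linked, not reproduced here, so the following is a hypothetical reconstruction of the idea rather than the original code. It assumes that "doubling" a string means concatenating it with itself. The integer version can return its local by value; the string version has to allocate its result on the heap and hand ownership to the caller.

```c
#include <stdlib.h>
#include <string.h>

/* Integer case: the result is returned by value, so a local is fine. */
int double_int(int n)
{
    int result = 2 * n;
    return result;
}

/*
 * String case (assumed meaning: "ab" becomes "abab").
 * A local array such as `char result[64];` stops existing when the
 * function returns, so a pointer to it must not be returned.
 * The result lives on the heap instead and the caller must free() it.
 * Returns NULL if the allocation fails.
 */
char *double_str(const char *s)
{
    size_t len = strlen(s);
    char *result = malloc(2 * len + 1);

    if (result == NULL)
        return NULL;
    memcpy(result, s, len);
    memcpy(result + len, s, len + 1); /* second copy brings the '\0' along */
    return result;
}
```

The two functions solve "the same" problem, yet only the second one needs an allocation, a failure path and a free() on the caller's side.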
Safety
C code tends to need more maintenance because of its unsafe properties. Some of these properties are:
- Though type checking is done by the compiler, the programmer is free to cast between types.
- C never, ever checks index ranges. In [2] it is claimed that this property makes C encourage buffer overflows. The same source gives a list of functions that may accidentally cause a buffer overflow.
- The programmer owns the memory! No safety check is done on memory allocation.
- You must not free the same pointer twice - if you do so, even accidentally, you are in trouble. There is no built-in mechanism to prevent this. The programmer must always check the pointer when allocating and freeing memory.
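Two of the points above can be sketched as small helpers. Both names, checked_set and free_and_clear, are made up for this illustration; they show the kind of checks the programmer has to write by hand, not anything the language or its library provides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: store value at index only if the index is in range. */
bool checked_set(int *array, size_t length, size_t index, int value)
{
    if (array == NULL || index >= length)
        return false;          /* the range check C never does for you */
    array[index] = value;
    return true;
}

/*
 * Hypothetical helper: free *pp and reset it to NULL.
 * A second call is then harmless, because free(NULL) does nothing.
 */
void free_and_clear(int **pp)
{
    if (pp == NULL)
        return;
    free(*pp);
    *pp = NULL;
}
```

This only protects the one pointer variable that is passed in; other copies of the same address still dangle after the call.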
Redundancy
[3] says “In many ways, the C language evolved into a collection of overlapping features, providing too many ways to say the same thing, while in many cases not providing needed features.” [2] has a lot of examples of that; I will not repeat all of them. Just consider the gets and fgets functions. They both do the same job (OK, they don’t, but fgets does what gets can do), and one of them is discouraged due to safety issues. But, like a lot of similar things, it cannot be fully thrown out of the language, because backward compatibility is needed.
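The difference between the two functions is that fgets is told how big the buffer is, while gets is not (the C11 standard later dropped gets for this reason). A minimal sketch of reading a line with fgets follows; read_line is a hypothetical helper name, not a library function.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read one line from `in` into buf.
 * At most size - 1 characters are stored, so buf cannot overflow.
 * The trailing newline, if any, is removed.
 * Returns 1 on success and 0 on end of file or read error.
 */
int read_line(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A call such as read_line(stdin, name, sizeof name) with a char array called name does what gets(name) was used for, with the length limit added. An over-long line is truncated and its remainder stays in the stream for the next read.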
String type
In [2], James A. C. Joyce wrote about strings in C:
Most sane programming languages have a string type which allows one to just say “this is a string” and let the compiler take care of the rest. Not so with C. It’s so stubborn and dumb that it only has three types of variable; everything is either a number, a bigger number, a pointer or a combination of those three. Thus, we don’t have proper strings but “arrays of unsigned integers”. “char” is basically only a really small number. And now we have to start using unsigned ints to represent multibyte characters.
Since C has no string type, you cannot copy or concatenate strings with the language’s own syntax; you have to do it with the help of functions. Hence, string operations are not a part of the language: they are library functions and you must include the appropriate header file to use them (string.h in ANSI C). Assigning one array variable to another is also not allowed in C, because in most expressions an array name simply turns into a pointer to its first element. Also, number-to-string or string-to-number conversion can only be done via functions. Look at these two styles - no comment.
C style:
strncpy(target, source + 2, 5);
target[5] = '\0';
Python style:
target = source[2:7]
Besides all this, C is a low-level programming language. It is not a string manipulation language. Expecting high-level string manipulation operations from it is not reasonable.
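To make the comparison concrete, a C counterpart of the Python slice needs a destination buffer, a length check and an explicit terminator. The function below is a minimal sketch; substring is a hypothetical name and the clamping behaviour is my own choice, loosely modelled on how Python slices tolerate out-of-range bounds.

```c
#include <string.h>

/*
 * Hypothetical helper: copy `count` characters of source, starting at
 * index `start`, into target and terminate the result.
 * target must have room for count + 1 bytes.
 * Out-of-range requests are clamped to the end of source.
 * Returns the number of characters actually copied.
 */
size_t substring(char *target, const char *source, size_t start, size_t count)
{
    size_t len = strlen(source);

    if (start > len)
        start = len;
    if (count > len - start)
        count = len - start;
    memcpy(target, source + start, count);
    target[count] = '\0';
    return count;
}
```

With a buffer declared as char target[6], the call substring(target, source, 2, 5) corresponds to target = source[2:7] in Python.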
Reaching elements
Another complaint about C is: why do we have both . and -> for the same purpose? Firstly, they are not for the same purpose. . selects a member of a structure directly, while -> first dereferences a pointer to a structure and then selects the member. Sure, we could expect the compiler to take care of this difference, or we could simply refuse to help the compiler do its job. But in my opinion this small difference makes the code easier to understand. Looking at the code, we can easily see which name is a pointer whose members we are dereferencing and which is the name of a real structure. Let me say I never liked the Java way of avoiding pointers.
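A small sketch of the difference, using a made-up struct point and two made-up functions:

```c
/* Illustrative type, not taken from any of the code listings. */
struct point {
    int x;
    int y;
};

/* Receives a pointer: members are reached with ->, and the caller's
 * structure is modified. p->x is shorthand for (*p).x. */
void move_right(struct point *p, int dx)
{
    p->x += dx;
}

/* Receives a copy of the structure: members are reached with . */
int sum_coords(struct point pt)
{
    return pt.x + pt.y;
}
```

Reading only the function bodies, the operator already tells us that p is a pointer and pt is a structure.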
goto statement
In [2], James A. C. Joyce claims that using a goto statement is the only way of breaking out of nested for or while loops. I am not sure his claim is true, but I don’t know a more efficient way of doing this either. According to [3], a study of 100,000 lines of code found that 90% of goto statements were used to break out of nested loops. A number of other languages use multi-level breaks to avoid this.
Perhaps the goto statement is the worst feature of the C language. You can jump into the middle of a loop or of nested loops using it. A single goto statement inside a loop can make your code ten times harder to read.
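The nested-loop case can be sketched as follows. Both function names are hypothetical. The first version leaves two loops with a single goto; the second gets the same effect without goto by putting the loops in their own function and returning, which is one common alternative, though I make no claim about which is faster.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Search a rows x cols grid (stored row by row) for target.
 * On success store the position in *row and *col and return true.
 */
bool find_with_goto(const int *grid, size_t rows, size_t cols, int target,
                    size_t *row, size_t *col)
{
    size_t r, c;
    bool found = false;

    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                found = true;
                goto done;      /* leaves both loops at once */
            }
        }
    }
done:
    if (found) {
        *row = r;
        *col = c;
    }
    return found;
}

/* Same search without goto: return ends both loops. */
bool find_with_return(const int *grid, size_t rows, size_t cols, int target,
                      size_t *row, size_t *col)
{
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                *row = r;
                *col = c;
                return true;
            }
        }
    }
    return false;
}
```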
enums
C’s enum construct has a significant problem that is easily solved by object-oriented programming. If you use a name in one enum, you cannot use the same name in another enum in the same scope. In an object-oriented programming language like Java, you may put the same constant name inside different classes.
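A short sketch of the clash and of the usual workaround, a per-enum prefix. The enum and function names here are invented for the example.

```c
/*
 * Enumerators live in the enclosing scope, not inside their enum.
 * Declaring these two enums side by side would therefore be rejected,
 * because RED is declared twice:
 *
 *     enum traffic_light { RED, YELLOW, GREEN };
 *     enum wine          { RED, WHITE, ROSE };
 *
 * The common workaround is to prefix every name by hand.
 */
enum traffic_light { LIGHT_RED, LIGHT_YELLOW, LIGHT_GREEN };
enum wine          { WINE_RED, WINE_WHITE, WINE_ROSE };

/* Hypothetical helper that maps an enumerator to a printable name. */
const char *wine_name(enum wine w)
{
    switch (w) {
    case WINE_RED:   return "red";
    case WINE_WHITE: return "white";
    case WINE_ROSE:  return "rose";
    }
    return "unknown";
}
```

The prefix plays the role that the class name plays in Java, except that the compiler does not enforce it.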
Error handling
Library functions report errors in the following ways (the complete list is taken from [2]):
- Returning zero
- Returning nonzero
- Returning a NULL pointer
- Setting errno
- Requiring a call to another function
- Outputting a diagnostic message to the user
There is no exception handling mechanism in C. This results in two critical problems. The first is what we know already: C is not appropriate for high-level programming. Secondly, since there is no single error-code convention, the programmer gets confused when writing code.
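The standard function strtol shows how several of these conventions can meet in a single call: it reports "no digits" through its end pointer, "out of range" through errno plus a special return value, and everything else through the return value itself. The wrapper below is only a sketch; parse_int and its 0 / -1 convention are my own choices, not part of any library.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: parse a decimal int from text.
 * Returns 0 and stores the value in *out on success.
 * Returns -1 and leaves *out untouched on any failure.
 */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;                          /* strtol only sets errno on failure */
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                      /* no digits, or trailing characters */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;                      /* does not fit in an int */
    *out = (int)value;
    return 0;
}
```

Every caller still has to remember to test the result with an if, which is exactly the burden an exception mechanism would remove.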
What C doesn’t have
- Exception handling mechanism
- Specialized data types
- Function overloading
- Garbage collection
A nice joke :) [2]
“Hey, Thompson, how can I make C’s syntax even more obfuscated and difficult to understand?”
“How about you allow 5[var] to mean the same as var[5]?”
“Wow; unnecessary and confusing syntactic idiocy! Thanks!”
“You’re welcome, Dennis.”
Conclusion
C is not the villain in this story, I think. It is just not really suitable for high-level programming. It should be used for what it was designed for.
Some nasty features cannot be abandoned due to backward compatibility. Experienced programmers always discourage newbies from using these features.
While coding in C, I feel as if I were in the middle of the 70s, when memory was so precious that I shouldn’t use a single byte unless I really needed it. This makes me sick of C :)
I think it is a matter of comfort and habits. For example, Fortran programmers complain about being able to change the loop variable inside the loop, but I cannot even dream of a world where I cannot change it :) Java programmers find pointers confusing; I am confused when I don’t see those stars in the code :)
To sum up, C and its features are not the real problem; the problem is using it where it should not be used. Never forget: C is not a vitamin that is useful in every condition, it is a low-level language!
References
[1] http://www.ce.itu.edu.tr/undergraduate/courses/blg437e/presentations/introduction.pdf : Programming Languages - Introduction by H. Turgut Uyar
[2] http://www.kuro5hin.org/story/2004/2/7/144019/8872 : Why C Is Not My Favourite Programming Language by James A. C. Joyce
[3] http://java.sun.com/docs/white/langenv/Simple.doc2.html : The Java Language Environment
[4] http://en.wikipedia.org/wiki/Comparison_of_programming_languages : Comparison of programming languages
[5] http://en.wikipedia.org/wiki/C_%28programming_language%29 : C (programming language)
[6] http://en.wikipedia.org/wiki/Criticism_of_the_C_programming_language : Criticism of the C programming language
Code listings
[1] http://www.ozgurmacit.com/files/Is-C-a-vitamin-01.c
[2] http://www.ozgurmacit.com/files/Is-C-a-vitamin-02.py_
[3] http://www.ozgurmacit.com/files/Is-C-a-vitamin-03.c
Hi Özgür,
Your criticism of the C programming language makes me very angry :) Now I will defend it with robust proofs.
Firstly, C is harder to read than object-oriented programming languages. That is true, but you should compare C’s readability with that of another procedural language. Python is an OO language, and everybody accepts that OO languages are more readable than procedural ones. Sometimes you cannot escape using a procedural language because of performance issues. Then you will see that C is the most readable one, once you really learn to use it.
Secondly, orthogonality is wanted because it makes the programmer’s job easier. However, the example you gave is a little strange, isn’t it? Look at the first example: you double a single variable, which is an integer. Look at the second example: you double an array of chars! They are not the same. In the first one you are increasing the content of a memory location, while in the second you are doubling the length of the memory! So it cannot be a criterion for determining C’s orthogonality.
Another issue you mention is safety. If the language does all the boundary checks and memory management, then it will be slow like Java :) My view is that C is as safe as any other programming language. However, if the programmer doesn’t care about security issues, C may become a kamikaze. I think it is not right to blame a programming language for the programmer’s faults!
Redundancy is a problem of all languages that live for a long time. Deprecated functions also exist in Java and other languages. Being able to say the same thing in many ways proves the flexibility of a language. It is an art.
In the reaching elements part, you explain the goto statement. You are not required to use all the features of a programming language. If it is really required you can use goto; it is a tool in your toolbox, but we generally prefer the break and continue keywords in loops. Have you ever used the “auto” keyword? Most programmers don’t know about it, but it exists in our toolbox; use it if you like!
You say: “There is no exception handling mechanism in C.” Let’s first examine what exception handling is. When you put code inside a try block and an exception is thrown, you can catch and handle it inside the catch block. The same thing can be done in C. Look at this code:
function( input1 , input2 , &errorCode );
After this function call returns, you know the status of the system. You can look at the errorCode variable; say, if it is 0, the function executed successfully. You can construct your own handler this way. It just requires a little more design.
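The pattern in this comment can be sketched with a concrete function. The divide function and the two error constants are hypothetical, chosen only to give the errorCode out-parameter something to report.

```c
/* Hypothetical error codes for the illustration; 0 means success. */
enum { ERR_NONE = 0, ERR_DIV_BY_ZERO = 1 };

/*
 * Divide a by b. The status is reported through *errorCode,
 * so the return value is free to carry the result.
 * When an error is reported, the returned value is 0 and has no meaning.
 */
int divide(int a, int b, int *errorCode)
{
    if (b == 0) {
        *errorCode = ERR_DIV_BY_ZERO;
        return 0;
    }
    *errorCode = ERR_NONE;
    return a / b;
}
```

The caller plays the role of the catch block by testing errorCode after every call. Unlike an exception, nothing forces that test to be written, and the error does not travel up the call chain by itself.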
Finally, 5[var] is the same as var[5], because it is simple arithmetic: adding 5 to var, which is the address of the array, and adding var to 5 produce the same result. For more information look at http://emreknlk.blogspot.com/2007/02/advanced-pointer-operations-in-c.html :)
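The equivalence follows from the definition of subscripting in C: a[b] means *(a + b), and addition is commutative. A minimal check, with same_element as a made-up function name:

```c
/*
 * Returns 1 when var[5], 5[var], *(var + 5) and *(5 + var)
 * all refer to the same element, judged by comparing addresses.
 * var must point to an array with at least 6 elements.
 */
int same_element(int *var)
{
    return &var[5] == &5[var]
        && &var[5] == var + 5
        && &5[var] == 5 + var;
}
```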
So my conclusion is: use C if you want to feel like a programmer, drink orange juice if you want to take some vitamins!
On the other hand, using C as a learning language forces students to learn some very important lessons, such as memory management. Often people who learn first with higher-level languages are terrible at certain details, such as performance and memory consumption. Personally I found it best to learn computer science from the ground up. Even though I don’t ever program in assembler it was very educational and helped build a strong foundation.
emreknlk,
I must first say that my article was not about what a bad language C is - I am just against programming language fanaticism.
My article does not say anything about how readable object-oriented languages (or Python) are; I am talking about C and how difficult it is to read (and write) C code. You are forced to write (and read) a large amount of irrelevant code while coding in C. Thus, C is not suitable for implementing high-level problems where performance is not a concern.
I cannot understand what you say about orthogonality. What I expect from an orthogonal programming language is that it takes care of whether my variable is an array or an integer - I don’t want to deal with how it is laid out in memory. Just remember, I assumed that I am writing a high-level program that is related to computer hardware neither directly (like a device driver) nor indirectly (like a performance-critical application).
The right thing to say about C’s safety issues (or about C in general) is, I think, that it depends on the problem you have to solve. Do you need good performance? Do you need bug-free software? Do you want to take care of all the machine-related things yourself? Do you want your code to include ifs here and ifs there to catch any error that may occur?
Your approach to exception handling is nice - it really is. But you know, as I do, that exception handling is something very different. I don’t expect C, as a low-level language, to have an exception handling mechanism; I am just trying to say it is not a handy language for high-level programming. Needless to say, you can do nearly everything in C by yourself. You are familiar with the do-it-yourself approach, as far as I know :)
So my conclusion is, again: use C where you really need it, but no more than that.
C Forever!!!
Unit testing this, C that, vitamins the other; it is all just talk.
The site turned out great :)
Well done!
C really is a language that should be learned first, because the programmer gets to see what is hidden in the details. If you go and teach Java first, the student cannot properly grasp pointers, dynamic memory management or the compile-link-load cycle. What matters is using the right language for the right purpose in the right place. For this reason Java, for example, lets you call your C functions from your Java code through JNI (Java Native Interface). But you cannot, say, go and write a distributed enterprise application in C; you will certainly need a higher-level language. Likewise, when writing a driver it makes sense to use a lower-level language…