Let’s compare Java and C#, two programming languages with large numbers of ardent fans and equally virulent detractors. Despite all the buzzing online (“I’m about to rant. Who else hates working in C#?” one blog might complain, even as another insists: “Java ruined my life.”), it’s hard to find real-life benchmarks for each language’s respective performance.
What do I mean by real life? I’m not interested in yet another test that grindingly calculates a million digits’ worth of Pi. I want to know about real-world performance: How does each language measure up when asked to dish out millions of Web pages a day? How do they compare when having to grab data from a database to construct those pages dynamically? These are the kinds of stats that tech folk like to know when choosing a platform.
Before we get started, we need to establish some terminology. When you write Java code, you usually target the Java Virtual Machine (JVM). In other words, your code is compiled to bytecode, and that bytecode runs under the management of the JVM. C#, meanwhile, generally runs under the Microsoft Common Language Runtime (CLR). C# is similarly compiled to an intermediate bytecode, which Microsoft calls Common Intermediate Language (CIL), and the CLR manages its execution.
Java and C#, then, are really just languages. In theory you could write Java code that targets the Microsoft CLR, and you could write C# code that targets the JVM. Indeed, there are several languages that target the JVM, including Scala, Groovy, Python (through Jython), and more. The most common languages targeting the CLR (in addition to C#) are Microsoft’s own Visual Basic .NET and its own flavor of C++, called C++/CLI. The CLR also supports several less-common languages, including Python (through IronPython) and Microsoft’s own F#.
Further, each runtime comes with a framework: a set of classes written by Oracle/Sun for the JVM and by Microsoft for the CLR. Oracle has its Java Platform, along with various APIs. Microsoft’s .NET Framework is a huge set of classes supporting development for the CLR; indeed, most people simply refer to the whole system as .NET rather than the CLR.
As such, we need to lay some groundwork for what we’re trying to accomplish. First, we’re not really comparing the languages themselves. What we need to compare is the underlying runtime. But even more than that, we need to also compare the performance of the frameworks. Therefore I’m going to do multiple comparisons, but ultimately try to match up apples to apples.
For example, it’s very possible to write your own HTTP listener in either C# or Java, and just send back an HTML page generated dynamically. But the reality is, almost nobody actually writes a low-level HTTP listener; instead, they tend to use existing HTTP servers. Most C# web apps rely on Microsoft’s IIS server. Server-side Java, on the other hand, can work with several different servers, including the Apache HTTP server and the Tomcat server. (Tomcat, for example, was built specifically to interact with server-side Java.) While we want to compare apples to apples, we want to stay realistic. The servers will very likely play a role in the response times, as one might be faster than the other. Even though the HTTP servers are not technically part of the runtime, they are almost always used, and will therefore be a factor. That’s why, after a first test in which we skip those servers and write our own small HTTP servers, we’ll try similar tests with the respective HTTP servers to get a more complete and accurate picture.
A Quick Note on the Hardware
I want to make sure the hardware in question introduces as few extraneous variables as possible. My own development machine has a ton of software on it, including many services that start up and steal processor time. Ideally, I would devote one entire core to the Java or C# process, but unfortunately the core allocation works the other way; you can limit a process to a single core, but you can’t stop other processes from using that core. So instead I’m allocating large servers on Amazon EC2, with close-to-barebones systems. Because I don’t want to compare Linux to Windows, and C# is primarily for Windows (unless we bring Mono in, which we’re not), I’ll run all tests on Windows.
On the client end, I don’t want network latency to interfere with the results either. A moment of slowness during one test would throw off the results. So I run the client code on the same machine as the server. While I can’t force the OS to reserve a core for a single process, I can restrict each process to a single core, which is what I do.
Collecting the Results
The results are timed on the client side. The optimal way to do this involves capturing the time and saving it, capturing the time again as needed, and continuing that way, without performing any time calculations until everything is done. Further, don’t print out anything at the console until all is done. One mistake I’ve seen people make is to grab a time at given points, and also at each point calculate the time difference and print it to the console. Consoles are slow, especially if they’re scrolling. So we’ll wait until we’re finished before calculating the time differences and writing to the console.
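As a rough illustration of that timing discipline, here is a minimal Java sketch. It is not the code used for the tests in this article (the client there is written in C#); the class and method names (DeferredTiming, recordTimestamps, toDurations) are hypothetical, and the workload is a stand-in for a real HTTP request. It stores raw timestamps in an array while the test runs, and only computes differences and prints once everything is done.

```java
public class DeferredTiming {

    /**
     * Runs the task the given number of times, recording one raw timestamp
     * before the first run and one after every run. No arithmetic and no
     * console output happen inside the loop.
     */
    public static long[] recordTimestamps(int iterations, Runnable task) {
        long[] stamps = new long[iterations + 1];
        stamps[0] = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
            stamps[i + 1] = System.nanoTime();
        }
        return stamps;
    }

    /** Converts raw timestamps into per-iteration durations in nanoseconds. */
    public static long[] toDurations(long[] stamps) {
        long[] durations = new long[stamps.length - 1];
        for (int i = 0; i < durations.length; i++) {
            durations[i] = stamps[i + 1] - stamps[i];
        }
        return durations;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption); a real test would issue one HTTP request here.
        Runnable task = () -> Math.sqrt(12345.678);

        long[] stamps = recordTimestamps(5, task);

        // Only now, after all measurements are captured, do the math and print.
        long[] durations = toDurations(stamps);
        for (int i = 0; i < durations.length; i++) {
            System.out.println("iteration " + i + ": " + durations[i] + " ns");
        }
    }
}
```

The array is allocated before the loop so that no allocation or console output lands inside the measured region.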
The Client Code
It doesn’t really matter what we use for the client code as long as we use it consistently in all tests. The client code will mimic the browser and time how long it takes to retrieve a page from the server. I can use either C# or Java. I ended up using C# because there is a very easy WebClient class, and an easy timer class.
First Test: Listening for HTTP
Let’s get started. The first test will simply be code that opens an HTTP listener and sends out dynamically generated Web pages.
First: the Java version. There are many ways we can implement this, but I want to draw attention to two separate approaches. One is to open a TCP/IP listener on port 80 and wait for incoming connections; this is a very low-level approach where we would use the ServerSocket and Socket classes. The other is to use the existing HttpServer class. I’m going to use the HttpServer class, and here’s why: If we really want to track the speed of Java compared to C#, without the Web, we can run some basic benchmarks that don’t involve the Web; we could create two console applications that spin through a bunch of mathematical equations and perhaps do some string searching and concatenation. But that’s a topic for another day. We’re focusing on the Web here, so I’ll start with the HttpServer, and similarly with the equivalent in C#.
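For readers who haven’t used it, the HttpServer approach can look roughly like the sketch below. This is a minimal illustration, not the exact server measured in this article: the class name SimpleHttpServerSketch, the start helper, the page content, and port 8080 (rather than port 80) are all assumptions. It relies on the com.sun.net.httpserver.HttpServer class that ships with the JDK.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class SimpleHttpServerSketch {

    /** Starts a server on the given port (0 picks any free port) and returns it. */
    public static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            // Build the page dynamically on every request (content is an assumption).
            String html = "<html><body>The date and time is " + new Date() + "</body></html>";
            byte[] body = html.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(8080); // port 8080 is an assumption; the article mentions port 80
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

Passing 0 as the port lets the operating system pick a free one, which is handy for quick local experiments.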
Right off the bat I find what appears to be an anomaly: the Java version takes hundreds of times as long to complete each request, and in some runs almost 2000 times as long. Processing 5 requests in a row takes a total of 17615 ticks when retrieving a string from a CLR program that uses the HttpListener class, whereas processing 5 requests to the Java server running the HttpServer class takes 7882975 ticks, a ratio of roughly 450 to 1. (When I switch to milliseconds, I see numbers such as 4045 milliseconds to process 15 requests on the Java server, and only 2 milliseconds to process 15 requests on the C# server, a ratio of roughly 2000 to 1.)
Adding some debugging info to the Java server, I discover that the function responsible for responding to incoming requests and sending out data actually runs quickly, nowhere near the three seconds or so being reported. The bottleneck appears to be somewhere in the Java framework, when the data is sent back to the client. But the problem doesn’t exist when the same client communicates with the C# server.
To get to the bottom of this one, I decide to switch to a different Java server. Instead of using the heavier HttpServer class, I create a simple TCP/IP socket listener using the ServerSocket class. I manually construct a header string and a body that match what I’m sending down in the C# version.
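A socket-level server of this kind can be sketched as follows. Again, this is an illustration under assumptions rather than the code I measured: the class name RawSocketServerSketch, the helper names, the specific headers, the page content, and port 8080 are all hypothetical choices. It handles one connection at a time, reads and discards the request headers, then writes a hand-built HTTP response.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawSocketServerSketch {

    /** Builds a complete HTTP response (status line, headers, blank line, body). */
    public static byte[] buildResponse(String html) {
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        String header = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        byte[] head = header.getBytes(StandardCharsets.US_ASCII);
        byte[] response = new byte[head.length + body.length];
        System.arraycopy(head, 0, response, 0, head.length);
        System.arraycopy(body, 0, response, head.length, body.length);
        return response;
    }

    /** Serves a single connection, then closes it. */
    public static void handle(Socket client, String html) throws IOException {
        try (Socket socket = client) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            // Read and discard the request line and headers, up to the blank line.
            while ((line = in.readLine()) != null && !line.isEmpty()) {
                // Nothing to do: this sketch sends the same page for every request.
            }
            OutputStream out = socket.getOutputStream();
            out.write(buildResponse(html));
            out.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is an assumption; binding to port 80 usually needs elevated rights.
        try (ServerSocket listener = new ServerSocket(8080)) {
            while (true) {
                handle(listener.accept(), "<html><body>Hello from a raw socket</body></html>");
            }
        }
    }
}
```

The response is assembled into a single byte array and written in one call, which keeps the header and body from going out as separate small writes.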
After that, I see a huge improvement. I can now run a large number of tests: I perform 2000 requests, one after the other, without computing the elapsed time until all 2000 calls to the Java server are finished; then I perform the same process with the C# server. In this case, I can use milliseconds for the measurement. Calling the Java server 2000 times takes 2687 milliseconds. Calling the C# server 2000 times takes 214 milliseconds. The C# one is still much faster, by a factor of more than 12.
Because of this discrepancy, I feel compelled to try out the Java version on a Linux server. The server used is a “c1.medium” on Amazon EC2. I install the two different Java classes and see essentially the same speeds. The HttpServer class takes about 14 seconds to process 15 requests. Not very good.
And finally, to be absolutely sure, I write an equivalent client program in Java that retrieves the data. It records similar times as well.
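A Java client along those lines might look like the sketch below. It is a hedged illustration, not the program I ran: the class name TimedClientSketch, the method names, and the default URL are assumptions. It fetches the same page repeatedly, one request after the other, and takes only two timestamps, one before the whole batch and one after.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimedClientSketch {

    /** Fetches the URL, reads the whole body, and returns the number of bytes read. */
    public static int fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            byte[] buffer = new byte[8192];
            int total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        }
    }

    /** Issues the requests sequentially and returns the total elapsed milliseconds. */
    public static long timeRequests(String url, int count) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            fetch(url);
        }
        long end = System.nanoTime();
        return (end - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        // The default URL is an assumption; pass the real server address as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/";
        long millis = timeRequests(url, 2000);
        System.out.println("2000 requests took " + millis + " ms");
    }
}
```

One caveat with this sketch: HttpURLConnection may reuse connections between requests when the server allows it, so its numbers are only comparable to those from another client that behaves the same way.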
Second Test: Full Website
It’s rare that people roll their own HTTP servers. Instead, C# programmers usually use IIS; Java programmers have a few choices, including Tomcat. For my tests I’m going to use those two servers. For C#, I’m specifically going to use the ASP.NET MVC 4 platform running on IIS 8. I’m going to take two approaches: the first returns a string of HTML from the controller itself; the second returns a view that includes a date/time lookup.
For the Java tests, I can take two similar approaches. I can have a servlet return some HTML, or I can return the results of a JSP page. These are analogous to the C# controller and view approaches, respectively. I could use the newer JavaServer Faces or any number of other frameworks; if you’re interested, you might try some tests against these other frameworks.
The C# controller simply returns a string of HTML. Running my client test for 2000 iterations sees a time of 991 milliseconds total. That’s still faster than my Java socket version.
The view version of the C# app creates a full standards-compliant HTML page, with an HTML element, head element, meta element, title element, body element, and an inner div element containing the text “The date and time is” followed by the full date and the full time. The date and time are retrieved through the DateTime.Now property, and filled in dynamically with each request.
Running the client test for 2000 iterations against this view version takes 1804 milliseconds, about twice as long as the direct controller version. The direct one returns shorter HTML, but increasing the size of the HTML string to match the view version shows no difference; the time hovers around 950-1000 milliseconds. Even adding in the dynamic date and time doesn’t result in any noticeable increase. The view version takes about twice as long as the controller version, regardless.
Now let’s move on to Java. The servlet is just as simple as the controller in the C# version. It just returns a string that contains an HTML page. Retrieving 2000 instances takes 479 milliseconds. That’s roughly half the time of the C# controller, which is very fast indeed.
Returning a JSP page is also fast. As with C#, it takes a bit longer than the controller. In this case, retrieving 2000 copies takes 753 milliseconds. Adding a call in the JSP file to retrieve the date makes no noticeable difference. In fact, the Tomcat server apparently performs some optimization, because after a few more requests, the time to retrieve 2000 copies goes all the way down to 205 milliseconds.
These results are quite interesting. Having worked as a professional C# programmer for many years, I’ve been told anecdotally that .NET is one of the fastest runtimes around. The full-website tests here suggest otherwise, even though C# came out ahead in the bare HTTP listener test. Of course, the tests are quite minimal; I didn’t do massive calculations, nor did I do any database lookups. Our space is limited here, but perhaps another day soon I can add in some database tests and report back. Meanwhile, when real web servers are involved, Java is the clear winner here.