From the QWANtify Blogs
Performance Tuning – So why is it slow
So now you have some data (from your tests) and you know what the data should tell you (your goals and others’ expectations). If you are still reading this, then your data is probably not where it should be. So, what do you do? How do you succeed? I use a scientific method. First, you take your data and you identify ONE and only ONE problem point. Sit back, think about what may be causing it, but DO NOT CHANGE THE CODE TO FIX IT YET! See if you have data that shows you what is going wrong. Remember, you should have more than one metric to show you exactly where the problem is. If you don’t have enough metrics, then add more. This is where Meta Programming, AOP, and Dependency Injection are very valuable, because they let you add metrics with the least impact on the code.
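As a rough illustration of adding a metric without touching the code being measured, here is a minimal sketch that uses the JDK's dynamic proxies in place of a full AOP framework. The names `TimingProxy` and `Timed` are made up for this example, and it only works for objects that are used through an interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical helper: wraps an interface-based object and records calls and elapsed time per method. */
final class TimingProxy implements InvocationHandler {
    private final Object target;
    private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> nanos = new ConcurrentHashMap<>();

    private TimingProxy(Object target) {
        this.target = target;
    }

    /** Gives access to both the wrapped object and the numbers collected for it. */
    static final class Timed<T> {
        final T proxy;
        private final TimingProxy stats;

        Timed(T proxy, TimingProxy stats) {
            this.proxy = proxy;
            this.stats = stats;
        }

        long calls(String method) {
            return stats.sum(stats.calls, method);
        }

        long totalNanos(String method) {
            return stats.sum(stats.nanos, method);
        }
    }

    /** Returns a proxy that behaves like the target but times every call made through it. */
    static <T> Timed<T> wrap(Class<T> iface, T target) {
        TimingProxy handler = new TimingProxy(target);
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return new Timed<>(iface.cast(proxy), handler);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        long start = System.nanoTime();
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the real exception, not the reflection wrapper
        } finally {
            long elapsed = System.nanoTime() - start;
            calls.computeIfAbsent(method.getName(), k -> new LongAdder()).increment();
            nanos.computeIfAbsent(method.getName(), k -> new LongAdder()).add(elapsed);
        }
    }

    private long sum(Map<String, LongAdder> map, String method) {
        LongAdder adder = map.get(method);
        return adder == null ? 0L : adder.sum();
    }
}
```

With dependency injection, the wrapped `proxy` could be handed out in place of the real object, so the timing can be added and removed without editing the class under test. Overloaded methods share one counter here because only the method name is used as the key.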
So, now you have data that tells you exactly where the problem is. There are probably several things that could be causing the problem. DO NOT FIX THEM ALL AT ONCE. Fix one and only one. Test it several times to prove it is fixed. Then, undo that change and prove it is broken again. Then put the fix back in and validate that it still works. Yes, this means you are going to be doing a lot of tests. Welcome to science. To me, this is where programming really is a science. You need to treat it as such, which also means communicating all of your data, saving all of your data, and being able to compare any test run against any other test run. Excel can work wonders here, or you can create a small database app that reads in all of your performance data and analyzes it. Have it write out to Excel, or to a command-line graphing tool like gnuplot, so that you can generate graphs. Yes, writing that app will take a bit, but you will be doing so many tests that it will pay off in the long run. It’s not just the amount of time it will save you, either. It will reduce errors and allow you to stay focused on the big issues.
During this time, I often find myself climbing into my own little world and only dealing with others when I need something from them. When you realize that you are doing that, stop. Get up and go find someone to talk to for a while. Go read /. or something. When solving big problems like this, you often need to walk away from the problem for a while before you find the answer. If you want to understand this more, I recommend the Pragmatic Programmers book Pragmatic Thinking and Learning: Refactor Your Wetware. It explains why you need to get out of your L Mode. If you don’t know what that means, then read the book.
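The fix, undo, fix-again cycle described above can be checked mechanically once every run's timings are saved. The following is only a sketch: the class name `FixCheck` is made up, and comparing run averages is a deliberately simple rule that assumes lower numbers are better and that you have collected several samples per run.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: compares timing samples from runs with and without a single change. */
final class FixCheck {
    /** Average of the samples recorded during one test run. */
    static double mean(double[] samples) {
        return Arrays.stream(samples).average()
                .orElseThrow(() -> new IllegalArgumentException("no samples"));
    }

    /**
     * True only when every run with the change is faster, on average,
     * than every run without it (baseline runs and "undo" runs together).
     */
    static boolean fixHolds(List<double[]> withoutChange, List<double[]> withChange) {
        double fastestWithout = withoutChange.stream().mapToDouble(FixCheck::mean).min()
                .orElseThrow(() -> new IllegalArgumentException("no runs without the change"));
        double slowestWith = withChange.stream().mapToDouble(FixCheck::mean).max()
                .orElseThrow(() -> new IllegalArgumentException("no runs with the change"));
        return slowestWith < fastestWithout;
    }
}
```

If `fixHolds` returns false, the runs overlap, and the data does not yet show that this one change is what made the difference.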
So, which problem do you try to fix first? The one that is taking up the most time, of course. Oh, sure, you agree with me now, but too many times I have seen people say, “Well, I don’t know how to fix that one yet, so I’ll go solve these little problems with caching right now. That is a simple change to my Spring config.” DO NOT DO THAT! Once you have fixed your biggest problem, the application will behave differently. Those changes you were going to make before may not be needed now. Remember, this is real science, and science takes time and does not respect a timeline. So, don’t waste your time on changes you may not have to make, or worse, introduce new bugs or performance issues. Remember, performance tuning is like Ogres, and each layer can be a vastly different Ogre. Don’t be afraid to make big, massive changes to the system. Just don’t make them unless you have a lot of data to back up the decision. You should have designed the system so that you can make changes easily. You also should be able to tell management how confident you are in the change based on the data you have.
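Picking the biggest problem is easy to automate once the totals are recorded. A tiny sketch, assuming you already have the total time spent per operation in a map (the class name `BiggestProblem` and the operation names in the usage are invented for illustration):

```java
import java.util.Map;

/** Hypothetical helper: picks the operation that consumes the most total time. */
final class BiggestProblem {
    /** Returns the name of the operation with the largest total time. */
    static String slowest(Map<String, Double> totalSecondsByOperation) {
        return totalSecondsByOperation.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalArgumentException("no measurements"));
    }
}
```

Note that the input is total time, so an operation that is quick but called thousands of times can still come out on top.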
Next time, I’ll talk about the side effect of being scientific, and being human.
-Kevin
Filed in: Team Member Blog
March 22, 2009 · by Josh Swan
Today we deployed an application to our QA server and found we were getting an unusual error with Hibernate. The error was:
org.hibernate.LazyInitializationException: failed to lazily initialize a collection, no session or session was closed
After debugging it for a while, I figured out it was occurring when the Drools rules engine tried to access a lazily loaded Hibernate field on one of the Hibernate objects that was passed into it. At first I thought it might be some kind of multi-threading problem, since that could make the session unavailable in a thread other than the main thread. After a bit of googling I found the answer. It turns out Drools uses shadow facts, which results in the Hibernate objects being copied in memory. The Hibernate session for the objects cannot be copied, so when the copies are used it appears that the session is not set or has been closed. To fix the problem I disabled the shadow proxies with the following code:
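import org.drools.RuleBase;
import org.drools.RuleBaseConfiguration;
import org.drools.RuleBaseFactory;
// Disable shadow proxies so the rules work on the original Hibernate objects
RuleBaseConfiguration conf = new RuleBaseConfiguration();
conf.setShadowProxy(false);
RuleBase ruleBase = RuleBaseFactory.newRuleBase(conf);
…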
Shadow facts should only be disabled, however, if you follow these rules:
- Your fact classes are immutable.
- Fact changes in the rules are done only in modify() blocks.
- Fact changes in your application are only done in modifyRetract() or modifyInsert().
You can find more information about shadow facts at this site: http://blog.athico.com/2008/02/shadow-facts-what-you-always-wanted-to.html.
Filed in: Team Member Blog
I was listening to the radio this morning while getting ready for work, and I heard that the EPA had designated March 16th through the 20th as “Fix a Leak Week”. My first thought was, “Oh great, another national something pitch by some group trying to garner media attention”, but as I drove in to work…
Filed in: Team Member Blog
A couple of months ago now, our office manager talked with me about hiring an MATC student she personally knew to assist in our office.
Filed in: Company Insight
After spending so much time working with software, I’ve recently had the opportunity to test my skills on building hardware. One of my birthday gifts this year was a build-it-yourself analog synthesizer kit. Analog synthesizers were very popular before digital synths took over in the 1980s. The Moog was one of the most popular analog synths. If you’ve heard any pop music from the 1970s you’ve probably listened to one.

Before this project, my experience with a soldering iron only involved desperate attempts at salvaging out-of-production headphones. I’d never worked with a PCB (printed circuit board) before. As usual, Google was a great help in finding basic tutorials. After orienting the dozens of resistors, chips and switches on the board, I had to solder about 100 connections that were only about a millimeter away from one another. This process gave me a much greater sense of just how sensitive the guts of my laptop are; almost any mistake will short a circuit and make the entire device unusable.
Working with hardware definitely requires much more planning than software. There were several times I wanted to punch an Apple-Z to undo what I’d just done. As a software guy, I’m used to trying to create something useful in several different ways, then integrating the best ideas from each into a final design. Once wires are cut, that’s not always an option.
Next, I’ll need to mount my components in an enclosure. I hope I’m as good with a drill as I am with a soldering iron…
Filed in: Team Member Blog
Or rather, my inner programmer.
I make the distinction because, aside from a complete lack of social skills, I can’t really lay claim to any of the other attributes that make a geek a geek. I’m not a gamer, trekker, or Star Wars fanboy. I don’t read/watch sci-fi or fantasy (with the exception of LOTR, duh), and I have never taken up residence in my parents’ basement. However, I do loves me some coding. I have always gotten a lot of satisfaction out of solving problems by writing computer programs. Now, if I could only write one that gets my 2 yr old to sleep better, I’d be ecstatic, not to mention rich.
Of late, I’d been trying to diversify my interests by playing guitar, reading, and one or two other things. After starting and stopping these activities and generally not being able to generate much interest in doing them, I realized that I just need to give up and accept the fact that what I really enjoy doing is programming. I think I’ll still plunk around on my guitar because I spent too much money on it to just drop it, but I’ve pretty much given up any ambition of becoming a musician. Not because I couldn’t, but because I just don’t want to. Now that I’m in my mid-thirties, I’m realizing that I need to be honest with myself and just enjoy the things that I enjoy and stop trying to be something I’m not. I think I’ll be happier in the end if I can do that successfully.
So I’m off to learn Spring, Maven, and RoR.
Filed in: Team Member Blog
Performance Tuning – Your first few tests
So, now you know how to prove what you think you know. You know your goals and what others are expecting. So let’s start testing, right? NOPE! I know, I know. Kevin, when do we actually start testing? How are we going to get anywhere if we don’t start testing? Well, now that you know where you are going (your goal) and any side trips you have to make (the other expectations), you need to be able to tell that you have accomplished all of these things. In other words, what metrics do you need and how are you going to record them? So, for each goal and expectation, what metrics are you going to use? Yes, I said metrics. You can’t assume one metric will be enough. After all, you don’t know what you don’t know, so you need other metrics to validate the metric you are using to check off each goal and expectation.
For example, if you are creating a web-based application, you are probably using a testing tool to generate various user loads to see how the system performs. How do you know that tool is the right one? Well, if you are using Apache you can set up the logs to report response time. (NOTE: The Apache response time includes sending the data to the user.) So, if the tool does not agree with the Apache logs, you have an issue. Is the tool downloading other content, like images, CSS, and JavaScript, that you are not seeing in your Apache logs? Does rendering the page in the browser take a long time? Maybe the page is too complex, with lots of tables and such, or maybe your tool is having problems. Maybe it is doing too many requests per server. What are the CPU, network, disk, and memory usage stats on the testing boxes?
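One way to get response times into the Apache logs is the `%D` format directive, which records the time taken to serve the request in microseconds. The sketch below assumes a log format with `%D` as the last field, for example `LogFormat "%h %l %u %t \"%r\" %>s %b %D"`; that layout and the class name `ApacheTimes` are assumptions made for this example, not something your server is guaranteed to use.

```java
import java.util.List;

/** Hypothetical helper: reads a trailing %D field (microseconds) from Apache access log lines. */
final class ApacheTimes {
    /** Returns the response time in milliseconds, assuming %D is the last field on the line. */
    static double responseMillis(String logLine) {
        String trimmed = logLine.trim();
        String lastField = trimmed.substring(trimmed.lastIndexOf(' ') + 1);
        return Long.parseLong(lastField) / 1000.0;
    }

    /** Average response time in milliseconds over a batch of log lines. */
    static double averageMillis(List<String> logLines) {
        return logLines.stream().mapToDouble(ApacheTimes::responseMillis).average()
                .orElseThrow(() -> new IllegalArgumentException("no log lines"));
    }
}
```

The average from the log can then be set next to the number your load tool reports for the same requests. A large gap between the two is exactly the kind of disagreement worth chasing.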
Notice something? That one metric of page response time has suddenly turned into several more metrics. This is why we are not testing yet. You need to look at each metric and figure out how you will validate it and, when something is wrong, how you will break it down even further. Going back to the web page example: if page requests are taking over 5 seconds, what is really taking the time? If the tool’s metrics and the Apache logs are close, then the problem isn’t at this level and you need to dig down deeper. If Apache hands off the page request to an application server (as often happens in Java web apps), can you tell how long the application server is taking? Remember, I said the Apache response time included the time to transfer the data. Maybe you have a network bottleneck. So in your application code, log how long you think it takes you to process a page request. Don’t reinvent the wheel here either. There are often great tools you can use, like JAMon for Java. It even has a Servlet Filter you can use to monitor your web application that will generate stats. But wait, Kevin. I have a nice network usage chart that shows there isn’t a problem. Does that chart show every switch and router in between the tool and the test server? Probably not. Again, you don’t know what you don’t know, so find ways to prove what you think you know. Don’t assume anything. Getting paranoid yet? No. No one told me to say that. If you got that, then maybe, just maybe, you are starting to realize you don’t know what you don’t know.
Now, you don’t need to implement each metric right now. Make sure you have ways to record each major metric. Don’t waste time putting in additional metrics now. Remember, performance tuning is like an Ogre. As you peel back each layer, you will be adding more metrics. For now, just think about what other metrics you might use and find out how to get more. Look at what additional metrics each tool, API, and component can give you. If possible, get people to design how they would add the other metrics.
Now run some tests. Run the same test 3 or 4 times. Don’t make any changes! Just run the tests. Then, look at the results. Are your results the same across all the runs? If they aren’t, don’t worry. It just means you don’t know what you don’t know. Take the metrics that are changing and come up with new metrics to validate those metrics and drill down into what is really going on.
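"The same" needs a number behind it, and one common choice is the coefficient of variation (standard deviation divided by the mean) of a metric across the repeated runs. This is a sketch only: the class name `RunStability` is invented, the metric is assumed to be positive (such as a response time), and whatever threshold you pass in is your own judgment call rather than a rule.

```java
import java.util.Arrays;

/** Hypothetical helper: measures how much one metric varies across repeated, unchanged test runs. */
final class RunStability {
    /** Sample standard deviation divided by the mean; 0.0 means every run gave the same result. */
    static double coefficientOfVariation(double[] runResults) {
        if (runResults.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        double mean = Arrays.stream(runResults).average().getAsDouble();
        double sumOfSquares = Arrays.stream(runResults)
                .map(x -> (x - mean) * (x - mean))
                .sum();
        double stdDev = Math.sqrt(sumOfSquares / (runResults.length - 1));
        return stdDev / mean;
    }

    /** True when the run-to-run variation is at or below the caller's chosen limit. */
    static boolean isStable(double[] runResults, double maxVariation) {
        return coefficientOfVariation(runResults) <= maxVariation;
    }
}
```

A metric that fails this check is a candidate for the drill-down described above, before any code is changed.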
Finally, make sure you know how much your metrics cost you. Any time you observe something (record data), you affect it. If you run your code in a profiler, it runs significantly slower. It will make out-of-process calls look faster because they are not running in the profiler. If you have to, run a test with and without your metrics, using a wall clock to check whether the tests take roughly the same amount of time. If they don’t, you need to compensate for that.
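The with-and-without comparison comes down to simple arithmetic. In this sketch the class name `MetricOverhead` is invented, and the `compensate` method assumes the overhead is spread evenly over everything you measured, which, as noted above for out-of-process calls, is often not true.

```java
/** Hypothetical helper: estimates what the metrics themselves cost in wall-clock time. */
final class MetricOverhead {
    /** Fraction of extra wall-clock time caused by the metrics, e.g. 0.10 for 10% slower. */
    static double overhead(double secondsWithoutMetrics, double secondsWithMetrics) {
        if (secondsWithoutMetrics <= 0) {
            throw new IllegalArgumentException("baseline time must be positive");
        }
        return (secondsWithMetrics - secondsWithoutMetrics) / secondsWithoutMetrics;
    }

    /** Scales an instrumented measurement back to a rough estimate of its uninstrumented value. */
    static double compensate(double measuredWithMetrics, double overhead) {
        return measuredWithMetrics / (1.0 + overhead);
    }
}
```

For example, a test that takes 100 seconds without metrics and 110 seconds with them has an overhead of 0.10, so a 5.5 second page time recorded with metrics on corresponds to roughly 5.0 seconds without them.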
Next time, I’ll talk about how to run the tests and what to do once you’ve found a problem.
-Kevin Runde
Filed in: Team Member Blog
I read a very interesting article this past week by Martin Fowler about technical debt [1]. Fowler explains Technical Debt, a term originally developed by Ward Cunningham [2], this way:
“In this metaphor, doing things the quick and dirty way sets us up with a technical debt, which is similar to a financial debt. Like a financial debt, the technical debt incurs interest payments, which come in the form of the extra effort that we have to do in future development because of the quick and dirty design choice.”
I’m sure every software developer can relate to this problem. Except in rare cases, it’s necessary for a long-term project or application to incur some debt due to time or implementation constraints. The key is how this debt is managed going forward, if at all. Some will choose to build upon an already faulty architecture, incurring even more debt which will have to be dealt with down the road. Others will make small, quick refactoring changes to slightly improve the situation. The worst-case scenario would involve the debt getting so bad that massive refactoring must be done, costing a company more money than if they had taken more initial development time.
The other factor to consider with technical debt is the fact that the original developers may not be around to “pay it off”. It could fall to someone else unfamiliar with the original design decisions and overall architecture, which can lead to even more cost.
The big question is how to best mitigate and handle technical debt. I’m curious to know what sort of approaches others have seen taken. How much debt is too much? Who is responsible for making sure this debt is repaid and doesn’t affect the project/company going forward? Should it be the managers or the developers? And how much time should be spent fixing old code? Please share your thoughts.
[1] http://martinfowler.com/bliki/TechnicalDebt.html
[2] http://en.wikipedia.org/wiki/Ward_Cunningham
Filed in: Team Member Blog