Possible Duplicate:
I’ve inherited 200K lines of spaghetti code — what now?
Recently I got hired to work on existing web application because of NDA I’m not at liberty to disclose any details but this application is working online in sort of a beta testing stage before official launch. We have a few hundred users right now but this number is supposed to significantly increase after official launch.
The application is written in PHP (but it is irrelevant to my question) and is running on a dual xeon processor standalone server with severe performance problems. I have seen a lot of bad PHP code but this really sets new standards, especially knowing how much time and money was invested in developing it.
- it is as badly coded as possible there is PHP, HTML, SQL mixed together and code is repeated whenever it is necessary (especially SQL queries). there are not any functions used, not mentioning any OOP
- there are four versions of the app (desktop, iPhone, Android + other mobile) each version has pretty much the same functionality but was created by copying the whole code base, so now there are some differences between each version and it is really hard to maintain
- the database is really badly designed, which is causing severe performance problems also for fixing some errors in PHP code there is a lot of database triggers used which are updating data on SELECT and on INSERT so any testing is a nightmare
Basically, any sin of a bad programming you can imagine is there for example it is not only possible to use SQL injections in literally every place but you can log into app if you use a login which doesn’t exist and an empty password.
The team which created this app is not working on it any more and there is an outsourced team which suggested that there are some problems but was never willing to deal with the elephant in the room partially because they’ve got a very comfortable contract and partially due to lack of skills (just my opinion).
My job was supposed to be fixing some performance problems and extending existing functionality but first thing I was asked to do was a review of the existing code base. I’ve made my review and it was quite a shock for the management but my conclusions were after some time finally confirmed by other programmers.
Management made it clear that it is not possible to start rewriting this app from scratch (which in my opinion should be done). We have to maintain its operable state and at the same time fix performance errors and extend the functionality.
My question is, as I don’t want just to patch the existing code, how to transform this into properly written app while keeping the existing code working at the same time?
My plan is:
- Unify four existing versions into common code base (fixing only most obvious errors).
- Redesign db and use triggers to populate it with data (so data will be maintained in two formats at the same time)
- All new functionality will be written as separate project.
- Step by step transfer existing functionality into the new project
- After some time everything will be in the new project
Some explanation about #2, right now it is practically impossible to make any updates in existing db any change requires reviewing whole code and making changes in many places.
Is such plan feasible at all?
Another solution is to walk away and leave the headache to someone else.
6
I faced the same situation with a Java app.
This is what I did:
I used the “fit a mountain in a tea cup” approach. This means you actually don’t put the mountain in the tea cup, insteada you dig for the diamond, then put the diamond in the teacup.
I identified the core functionality of the app, the most frequently used feature, the functionality that is responsible for the most part of duplicate code. I’m talking business logic only. Then I reengineered that business logic using OOP and tested it thoroughly making sure it did the exact same thing the old code did ( minus errors, plus performance improvements). Then I painstakingly went into every part of the code where the old logic was used and bypassed it making sure the new OOP, reengineered logic was called instead of the old one. Obviously the new code ended up having considerably less code lines.
Then I went to find my second diamond.
After three of four diamonds more. The very core of the application, under the hood was fresh and better. With the old code only running the less used, unimportant functionalities.
I didn’t re-wrote the whole app. That would have been trying to actually fill the whole mountain in a tea-cup.
1
It’s difficult to assess the best approach without understanding your environment and the relationship between management and engineering. It is also pretty difficult without knowing what kind of resources management is willing to give you to accomplish the task.
Before you get into fixing the code, you should have a strong understanding and vision of how you want to succesfully release this project. Usually bulleted points are the best way to communicate these things.
Preparation.
- if management will let you, create a team of talented and diverse engineers
- make sure the team is small (2-4 engineers) – nothing derails a team like administrative overhead
- have management’s buy in to let you (or your technical lead) run your team and have issues be fronted by you or your technical lead
- pick an engineer with strong database skills, orm skills and data scientist skills (know your data/content)
- pick an engineer with good php/web/development skills
- pick an engineer that is a jack of all trades, master of none
- pick an engineer who has a good tooling background
- pick an engineer with release management experience
- a quality engineer for me is one who can learn and adapt, rather than one who has memorized an api
- one engineer can fulfill several roles mentioned above
- create a strong team vibe (balance hard work with fun down time)
- organize your team to be agile
- do daily stand ups (scrums)
- do weekly iteration planning meetings (at first, bi-weekly as you find your rhythm)
- commit very frequently
- commit very small changes, but quickly
- identify your development life cycle and setup your continuous integration
- pick a versatile SCM that your engineers are comfortable with (i recommend git, but if your engineers know SVN then git can have a high learning curve – the key here is that the majority of your engineers should agree and be comfortable with it)
- pick a CI tool (hudson/jenkins/bamboo/many others) and get it going
- pick a bug tracker that integrates with CI, SCM and get it going
- enforce unit and integration testing
- make sure unit and integration tests are run in CI – you get quite a bang for your buck in unit tests
- unit tests/integration tests can be a source of documentation when code is not always good
- unit tests/integration tests help you fail fast
- you will need to refactor a ton of code … unit tests will save your skin here
- setup standards for the team: no more than 2 consecutive broken builds, etc, etc
- figure out a deployment strategy to deploy to test and production
- this will allow you to test your packaging and deployment
- allows you to test upgrading scenarios
- if you have a well defined deployment strategy, the cycle between new feature/bug fix development and production can be shortened quite a bit
Now the code.
- in general, by your description, you have a client/server stack where the server stack is composed of a database, a middle tier (running the PHP/SQL framework) and then the client browser
- note that in this scenario, small changes at the bottom of the stack (database) will ripple up through the stack and have a big impact on the presentation layer
- given how brittle you have described the code as being, the ripple effect will be intensified
- this will not be very manageable in discrete, small chunks if you start with the database layer
- you will also have to have a strong understanding of all the code before you can make any changes
- it’ll be the deer in the headlights scenario
- recommendation
- you had a win with management (i think) – with your code base analysis
- now you need a win with management to show your ability to execute
- identify low hanging fruit – there should be quite a few based on your description
- plan a way to fix and implement fixes to the low hanging fruit, even if they are hacks
- this will bring a win and will encourage management to give you the resources you need
- it will alleviate pressure from the customer base and hold them at bay, while you and your engineers work in parallel to refactor the code base
- start identifying and dissecting the client and middle tier with the following goals:
- it will make your understanding of the system much better
- you will want to parse your understanding into small, discrete chunks of features that can be changed without a huge ripple effect
- because you’re not yet changing database code, there won’t yet be large ripples and you can run fast
- as you move through this, identify database changes that have to be fixed and create requirements/issues around and assign them a priority
- identify dependencies / relationships between modules
Make sure you have the release management, SCM, CI, scrums and other tools in place as fast as you can. Once you have momentum, and trust me you’ll get it, the most difficult thing is stopping because you are not sure how to proceed or you do not have enough bandwidth to take care of these all-important but none-the-less side issues.
Hope this helps.
1
Management made it clear that it is not possible to start rewriting this app from scratch
This is your main concern. You can only patch it as you go along. Your plan involves a rewrite, but not a live one so you’re asking management to pay for this and also the maintenance of the old version.
I can only see you chasing your tail as changes are applied to the old system they’ll probably be incompatible with your new one and so you’ll just end up in an endless cycle of refactoring.
2
Management made it clear that it is not possible to start rewriting this app from scratch (which in my opinion should be done). We have to maintain its operable state and at the same time fix performance errors and extend the functionality.
Management can state their requirements, but if it won’t work, it won’t work. If I was you I would make clear to management (in writing) what you think the risks are. These might include:
- risk of system collapse due to performance issues,
- risk of major security problems,
- risk that extensions of functionality will be severely delayed.
A “can do” attitude is not going to help if the reality says that you can’t do it. Management needs to be fully aware of the risks … even if they choose to ignore them.
Apart from that … good luck!
Another solution is to walk away and leave the headache to someone else.
I’d be considering that too.
I’m not a proponent of re-write, be it a slow rewrite where you create a new project and slowly turn off functionality in the other app or a big bang rewrite.
I do like refactoring. Start reducing the number of interdependencies the app has (so that a page depends on say 24 instead of 56 external classes), fix the easy stuff like code formating, do the easy extractions like a data tier (move SQL executions to code external to what is calling it), find dead code and archive it. In the process things will break, so as you go along you will want to look for opportunities to set up integration tests. I tend to find that beautiful unit tests don’t become possible until fairly late in the maintenance cycle.
Another thing I’ve started doing recently for this type of code is extensive logging– be careful that your logging doesn’t interfer with production perf and behavior though. With code bases like this, a big challenge do moving it forward or rewriting it in any sense of the word is that it is hard to figure out what it is doing. With logging you can figure out what the true sequence of events is, what code is really dead, what code is the “root” that gets called over and over again, where the entry points are etc. Also, once you discover where the roots and hotspots are, you know what areas will cause the most pain should they break.
1