Note: You may wish to read Part I, Part II, and Part III of this series to gain context.
One of the prevailing requirements for the Plus application was that it allow field staff to enter data offline and “sync” it over the Interwebs at their leisure. In addition, the application had to get updated schedule information and previously-synced data and show it to the user while offline. As you know, the most logical design that meets this requirement is a disconnected client-server model. Therefore, a server had to be built to accept connections from the client, process incoming data, and send back schedule information and previously-synced data.
This by itself isn’t such a big deal, right? Apache Tomcat exists just for tasks of this nature. It’s an entire server container that’s mature, heavily tested, well maintained, and trusted by enormous companies that could easily afford to choose any server product on Earth. It includes all sorts of advanced configuration options, logging facilities, exception handling, and standards adherence. There’s even a Windows installer package available! Installation, configuration, and basic Java implementation are fast and easy.
But since this was Jim’s project, it wasn’t fast, easy, simple, logical, standards-based, or maintainable. Jim rolled his own Tomcat server instead. Yes, that’s right – he ignored the ready-built server software available and wrote his own software to do the same thing. However, his version didn’t have the benefit of testing…or configuration options…or exception handling…or standards compliance…or maintainability…or – well, you get the idea. As a result, Jim’s Tomcat didn’t really work.
Jim split the server implementation into two completely separate parts. The application itself was only responsible for receiving client requests for schedule information – but not for sending the data back to the client. Instead, it just zipped the schedule data into a temp directory and sent the file path back to the client. The client then connected to an SFTP server and downloaded the specified file. When the client needed to download previously-submitted data files, it connected to the same SFTP server, navigated to a hard-coded directory, and downloaded everything in it. Yes, that’s right – each user ended up downloading all data ever submitted by any of the 150 users every time the client updated. As anyone might imagine, the directory quickly grew and eventually consumed all disk space on the server, crashing it.
The submission of data, however, was handled by the client. The client connected to that same SFTP server, navigated to a hard-coded directory, and uploaded individual files to it. Then a cron job ran every 60 seconds, swept the directory, determined the type of each data file by matching the beginning of its filename, loaded the data into a database, and moved the files to the “submitted data” folder that every client downloaded. If a file couldn’t be read, the cron job died and any files not already loaded were left in the folder. The next time the cron job ran, the same problem would occur…without any notification to anyone.
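To make that failure mode concrete, here is a minimal sketch of what a less fragile sweep might look like. This is not Jim’s code, and it isn’t necessarily what I ended up writing; the filename prefixes, the quarantine folder, and the load_record callback (standing in for the database insert) are all assumptions made up for illustration. The only point is that each file gets its own error handling, so one unreadable file is logged and set aside instead of killing the whole run.

```python
import logging
import shutil
from pathlib import Path

log = logging.getLogger("sweep")

# Hypothetical filename prefixes; the real ones aren't given in this post.
PREFIXES = {"visit_": "visit", "survey_": "survey"}


def classify(filename):
    """Determine the data type by matching the beginning of the filename."""
    for prefix, kind in PREFIXES.items():
        if filename.startswith(prefix):
            return kind
    raise ValueError(f"unrecognized file type: {filename}")


def sweep(incoming, submitted, quarantine, load_record):
    """Load every file in `incoming`, handling failures one file at a time.

    `load_record(kind, text)` is a stand-in for the database insert.
    Returns (loaded, failed) lists of filenames.
    """
    for folder in (submitted, quarantine):
        folder.mkdir(parents=True, exist_ok=True)

    loaded, failed = [], []
    for path in sorted(p for p in incoming.iterdir() if p.is_file()):
        try:
            kind = classify(path.name)
            load_record(kind, path.read_text(encoding="utf-8"))
        except Exception:
            # Log it and move the bad file aside so it can't block the
            # next run; the remaining files are still processed.
            log.exception("could not load %s; quarantining", path.name)
            shutil.move(str(path), str(quarantine / path.name))
            failed.append(path.name)
            continue
        shutil.move(str(path), str(submitted / path.name))
        loaded.append(path.name)
    return loaded, failed
```

Calling something like sweep(Path("incoming"), Path("submitted"), Path("quarantine"), insert_into_db) from the scheduled job, where insert_into_db is whatever actually writes to the database, would leave the bad files in the quarantine folder and a stack trace in the log, which is at least a notification to someone.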
I inherited all of this when I was stuck with the Plus project. Had I known what I was getting into, I would’ve said “You’ve gotta pay me more.”