wcarss 17 hours ago

> So, no. There is no development configuration in production, or mirroring of a point of sales terminal to another system that's running development code.

This is a misreading of the suggestion, I think. My reading of the suggestion is to run a production "dry run" parallel code path, which you can reconcile with the existing system's work for a period of time, before you cut over.

This is not precluded by PCI; it is exactly the method a team I led used to verify a rewrite of, and migration to, a "new system" handling over a billion dollars of recurring billing transactions annually: write the new thing with all your normal testing etc., then deploy it alongside the old one in a "just tell us what you would do" mode, then verify its operation for specific case classes, and then roll over progressively to using it for real.
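A minimal sketch of that sequence, assuming the old and new paths can both be called as plain functions on a transaction. `ShadowRunner`, `legacy`, `candidate`, and the case-class labels are hypothetical names for illustration, not anything from the system described above:

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ShadowRunner:
    """Legacy path does the real work; the candidate only reports what it would do."""
    legacy: Callable[[Any], Any]      # hypothetical: the existing production path
    candidate: Callable[[Any], Any]   # hypothetical: the new path, side-effect free here
    live_classes: set = field(default_factory=set)  # case classes already cut over
    # per case class: [matching results, total compared]
    stats: dict = field(default_factory=lambda: defaultdict(lambda: [0, 0]))

    def handle(self, case_class, txn):
        if case_class in self.live_classes:
            return self.candidate(txn)  # this class has been cut over
        real = self.legacy(txn)
        try:
            planned = self.candidate(txn)
        except Exception:
            planned = None  # a candidate failure must never affect the real result
        matched, total = self.stats[case_class]
        self.stats[case_class] = [matched + (planned == real), total + 1]
        return real

    def promote(self, case_class, min_samples, min_match_rate=1.0):
        """Cut one case class over once enough dry runs agree with the legacy path."""
        matched, total = self.stats[case_class]
        if total > 0 and total >= min_samples and matched / total >= min_match_rate:
            self.live_classes.add(case_class)
            return True
        return False
```

In a real billing system the comparison would be against persisted outcomes rather than in-memory return values, and the thresholds would be a judgment call per case class; the point is only that the legacy result is always what gets used until a class is explicitly promoted.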

edit: I don't mean to suggest this is a trivial thing to do, especially in the context you mentioned with many elements of hardware and likely odd deployment of updates, etc.

shagie 14 hours ago

Our reading of PCI DSS was that there could be no development code in a production build. Having a --dry-run flag would have meant doing that.

You could do "here is the list of SKUs for transaction 12120112340112345 - run this through the system and see what you get" on our dev boxes hooked up to QA store 2 (and an old device in the lab hooked up to QA store 1). That's not a problem.

Sending the scanner reads to the current production and a dev box in production would have been a hardware challenge. Not completely insurmountable but very difficult.

Sending the keyboard entry to both devices would be a problem. The screens were different and you can hand enter credit card numbers. So keyboard entry is potentially PCI data.

The backend store server would also have been difficult. There were updates to the store server (QA store 1 vs QA store 2 running simultaneously) that were needed too.

This wasn't something that we could progressively roll out within a store. When a store was to get the new terminals, they got a new hardware box, Ingenicos were swapped with Epsons, old Epsons were replaced with new ones (same device, but the screens had to be changed to match a different workflow - they were reprogrammable, but that was something that stores didn't have the setup to do), and a new build was pushed to the store server. You couldn't run register 1 with the old device and register 2 with a new one.

Fetching a list of SKUs, printing up a page of barcodes and running it was something we could do (and did) in the office. Trying to run a new POS system in a non-production mode next to production and mirroring it (with reconciling end of day runs) wasn't feasible for hardware, software, and PCI reasons that were exacerbated by the hardware and software issues.

Online this is potentially easier to do: send a shopping cart to two different price calculators and log whether the new one matches the old one. With a POS terminal, this would be more akin to hooking the same keyboard and mouse up to a Windows machine and a Linux machine. The Windows machine is running MS Word and the Linux machine is running OpenOffice, and after five minutes of use of the Windows machine you check whether the Linux machine has the same text entered into OpenOffice. Of course it doesn't - the keyboard entry commands are different, the windows are different sizes, the menus have things in different places in different drop downs... similarly, trying to do this with the two POS systems would be a challenge. And to top it off, sometimes the digits typed are hand-keyed credit card numbers when the MSR couldn't get a read - and you have to make sure those don't show up on the Linux machine.
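For the online case, the two-calculator comparison can be sketched roughly as below. This is an illustration only: `price_with_shadow`, `old_calc`, and `new_calc` are hypothetical names, and the cart is assumed to hold only SKUs, quantities, and prices (no card data), so logging it is not a PCI concern in this sketch:

```python
import logging

log = logging.getLogger("price_shadow")


def price_with_shadow(cart, old_calc, new_calc):
    """Return the old calculator's total; log when the new one disagrees or fails."""
    old_total = old_calc(cart)
    try:
        new_total = new_calc(cart)
    except Exception:
        # The new calculator is only being observed, so its failures are logged
        # and otherwise ignored.
        log.exception("new calculator failed for cart %r", cart)
        return old_total
    if new_total != old_total:
        log.warning("price mismatch: old=%s new=%s cart=%r",
                    old_total, new_total, cart)
    return old_total
```

The customer is always charged the old calculator's total; the new one only produces log lines to reconcile later. This works because both calculators take the same input, which is exactly what the POS case lacks.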

I do realize this is reminiscent of the business handing over a poorly spec'ed thing and, each time someone says "what about...", coming up with another reason it wouldn't work. This was a system that I worked on for a long while (a decade and a half ago), and I could spend hours drawing and explaining diagrams of the system architecture and the issues that we had. Anecdotes of how something worked in a 4M SLOC system are inherently incomplete.

wcarss 12 hours ago

Neat! Yeah, that's a pretty complex context and I completely see what you mean about the new hardware being part of the rollout and necessarily meaning that you can't just run both systems. My comment is more of a strategy for just a backend or online processing system change than a physical brick and mortar swap out.

In my note about misreading the suggestion, I was thinking generally. I do believe that there is no reason from a PCI perspective why a given production system cannot process a transaction live and also in a dry mode on a new code path that's being verified. But if the difference isn't just code paths on a device, and instead involves hardware and process changes, your point about needing to deploy a dev box and that being a PCI issue totally makes sense, plus the bit about it being a bad test anyway because of the differences in actions taken or outputs.

The example you gave originally, of shipping to the lower stake exceptional stores first and then working out issues with them before you tried to scale out to everywhere, sounded to me like a very solid approach to mitigating risk while shipping early.

shagie 11 hours ago

More of the background of the project.

The original register was a custom-written C program running in DOS. It was getting harder and harder to find C programmers. The consultancy that had part of the maintenance contract for it was also having that difficulty, and it was both raising its rates and deprioritizing the work items because its senior people (the ones who still knew how to sling C and fit it into computers with 4 MB of memory that you couldn't get replacement parts for anymore) were on other (higher paying) contracts.

So the company I worked at made the decision to switch from that program to a new one. They bought and licensed the full source to a Java POS system (and I've seen the same interface at other big retail companies too) and would replace all the hardware in all the stores... ballpark 5000 POS systems.

The professional services consultancy was originally brought in (I recall it being underway when I started there in 2010). They missed deadlines and updates, and I'm sure legal got in there with failure to deliver on contract. I think it was late 2011 that the company pulled the top devs from each team and set us to working on making this ready in all stores by October 2012 (side note: tossing two senior devs from each of four different teams into a new team results in some challenging personality situations). And that's when we (the devs) flipped the schedule around: instead of March 2013 for the cafeteria and surplus store (because they were the odd ones), we were going to get them in place in March of 2012 so that we could have low risk production environments while we worked out issues (so many race conditions and graphical event issues hanging with old school AWT).

---

... personality clash memory... it was on some point of architecture and code and our voices were getting louder. Bullpen work environment (a bunch of unsaid backstory here), but the director was in the cube on the other side of the bullpen from us. The director "suggested" that we take our discussion to a meeting room... so we packed up a computer (we needed it to talk about code) and all of the POS devices that we needed, put it all on a cart, pushed the cart down the hall into a free conference room (there were two conference rooms on that floor - no, this wasn't a building designed for development teams), set up, and went back to loudly discussing. However, we didn't schedule or reserve the room... and the director who had kicked us out of the bullpen had reserved the room that we had been kicked into, for shortly after we got there. "We're still discussing the topic, that will probably be another 5-10 minutes from now... and it will take us another 5 minutes to pack the computer back up and take it back to the bullpen. Your cube with extra chairs in it should be available for your meeting and it's quiet there now without our discussions going on."