garciasn 15 hours ago

Data scientists, whether experienced or straight out of school, are EXTREMELY lacking in valuable SQL experience and always have been. Take a DS with 25 years of experience in SAS: many of them are great with the DATA step but have far less experience using PROC SQL to query the data in the most effective way, even if they were pulling the data down with pass-through via SAS/ACCESS.

Often they'd run very simplistic queries and then manipulate the results in the DATA step before running whatever modeling and/or reporting PROCs came later, rather than pushing that work upstream into a far faster native database SQL pull via pass-through.
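
To make the contrast concrete, here is a rough sketch of the two patterns. It is not SAS: Python's built-in sqlite3 stands in for the Oracle backend, and the `orders` table, its columns, and the function names are all made up for illustration. The first function pulls every row down and aggregates on the client (roughly the DATA step habit); the second lets the database do the aggregation and only returns the summary (roughly what pass-through buys you).

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (a stand-in for the real backend)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and columns, for illustration only.
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 10.0), ("east", 5.0), ("west", 7.5), ("west", 2.5), ("north", 1.0)],
    )
    return conn


def totals_pull_then_aggregate(conn):
    """Pull every row to the client, then aggregate locally."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_pushed_down(conn):
    """Let the database aggregate; only one row per region crosses the wire."""
    cursor = conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
    return dict(cursor.fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    print(totals_pull_then_aggregate(conn))
    print(totals_pushed_down(conn))
```

Both functions return the same totals. On five rows the difference is invisible; the gap only shows up when the table is large and sits across a network, since the first version ships every row to the client before doing any work.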

Back in 2008/2009, I saved 30h+ of runtime on a regular report by refactoring everything into SQL via pass-through, replacing the data scientists' original code that simply pulled the data down from the external source and manipulated it in the DATA step. Moving from 30h to 3m (Oracle backend) freed up an entire FTE to do more than babysit a long-running job, and let the report go from running 3x a week to multiple times per day.