
SQL Resources, or Show Me The Data!

As I am pretty sure was said, loudly and repeatedly, in the movie Jerry Maguire: “Show Me The Data!”
It is the Data Scientist’s Mantra...


Data is in our name, and you really do need data to do your work. We often have a single place to store and find all the data for that work (a data warehouse). But of course, the better your tools for grouping, subsetting, ordering, and otherwise transforming that data, the more insights you can reveal with the rest of your skills.


To command all the power that the data warehouse offers and that your analytical work deserves, having a solid grip on SQL (Structured Query Language) is key. It will make your life easier, happier, and simply more wonderful.
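As a small taste of what that looks like, here is a minimal sketch of subsetting, grouping, and ordering in a single query. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for a real warehouse; the `visits` table and its rows are made up purely for illustration.

```python
import sqlite3

# Hypothetical table and data, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (clinic TEXT, year INTEGER, patients INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("North", 2014, 120),
        ("North", 2015, 150),
        ("South", 2014, 90),
        ("South", 2015, 60),
        ("East", 2015, 200),
    ],
)

# Subset (WHERE), group (GROUP BY), and order (ORDER BY) in one statement.
query = """
    SELECT clinic, SUM(patients) AS total_patients
    FROM visits
    WHERE year = 2015
    GROUP BY clinic
    ORDER BY total_patients DESC
"""
totals = conn.execute(query).fetchall()

for clinic, total in totals:
    print(clinic, total)
```

The same SELECT should run largely unchanged on most warehouse databases, though connection details and some dialect features will differ.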


So, here are some tools to aid in that quest.

Books

First of all, I would like to introduce you to O’Reilly books, mostly because of two titles I have found repeatedly useful:


  • SQL Pocket Guide – A Guide to SQL Usage. This slim volume helps with the details of syntax and everything else I can’t remember. It runs to fewer than 200 pages and has a good index. The current edition is the 3rd, and my copy is getting a little dog-eared.


  • SQL Cookbook – This provides a list of tasks you’re likely to need to accomplish and, for each, the ways to get it done. Examples of some of the task descriptions: Retrieving a Subset of Rows from a Table; Finding Rows That Satisfy Multiple Conditions; Concatenating Column Values; Transforming Nulls into Real Values; Searching for Patterns.


O’Reilly offers a good deal if you want both the paper and eBook formats of a title, and the books are very well edited and geared towards answering questions.
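To give a flavor of the Cookbook-style tasks listed above, here is a rough sketch of one possible query for each. These are not the book's own recipes. The `emp` table, its columns, and its rows are hypothetical, and the queries run against SQLite through Python's built-in sqlite3 module, so the syntax may need adjusting for other databases (for example, some use CONCAT() or + instead of || for concatenation).

```python
import sqlite3

# Hypothetical employee table, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, job TEXT, dept INTEGER, bonus INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        ("Smith", "Clerk", 20, None),
        ("Allen", "Sales", 30, 300),
        ("Adams", "Clerk", 10, None),
        ("Blake", "Manager", 30, 500),
    ],
)


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Retrieving a subset of rows from a table
dept_30 = run("SELECT name FROM emp WHERE dept = 30 ORDER BY name")

# Finding rows that satisfy multiple conditions
clerks_not_10 = run("SELECT name FROM emp WHERE job = 'Clerk' AND dept <> 10")

# Concatenating column values (|| is the operator SQLite uses)
labels = run("SELECT name || ' works as ' || job FROM emp WHERE dept = 10")

# Transforming nulls into real values
bonuses = run("SELECT name, COALESCE(bonus, 0) FROM emp ORDER BY name")

# Searching for patterns
a_names = run("SELECT name FROM emp WHERE name LIKE 'A%' ORDER BY name")

for result in (dept_30, clerks_not_10, labels, bonuses, a_names):
    print(result)
```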


Online tutorials

There are a number of online resources which offer training in a variety of formats (multimedia, written word, etc.) and the best also include support for actually working through problems online.
  • Khan Academy offers audio/video training with exercises at: https://www.khanacademy.org/computing/computer-programming/sql – These have the trademark depth and quality of all the Khan Academy offerings, along with exercises, grading, and progress tracking.


  • Tutorialspoint offers a written presentation along with diagrams and a coding ground to practice what you are learning. http://www.tutorialspoint.com/sql/index.htm
  • SQLZoo is more of a show-it, do-it tutorial arrangement: small bites of information with the chance to try out each thing presented in-line. http://sqlzoo.net/wiki/SQL_Tutorial
  • A SQL tutorial from a data scientist’s perspective, for data scientists: http://downloads.bensresearch.com/SQL.pdf

Ephemera

These are some resources I came across that talk about the intersections of R and SQL:
  • SQL and R are allies: https://www.simple-talk.com/sql/reporting-services/making-data-analytics-simpler-sql-server-and-r/
  • Reaching into R dataframes with SQL: http://www.burns-stat.com/translating-r-sql-basics/
  • There is a SQL playground online where you can try out creating databases and using a range of SQL manipulations at http://sqlfiddle.com/. It’s got a lot in common with R Fiddle (at http://www.r-fiddle.org/#/).
