As I am pretty sure was said, loudly and repeatedly, in the movie Jerry Maguire: “Show Me The Data!”
It is the Data Scientist’s Mantra...
Data is in our name, and you really do need data to do your work. We often have a single place to store and find all the data for the work we do (a data warehouse). But of course, the better your tools for grouping it, subsetting it, ordering it, and otherwise transforming it, the more insights you can reveal with the rest of your skills.
To command all the power that the data warehouse offers and that your analytical work deserves, having a solid grip on SQL (Structured Query Language) is key. It will make your life easier, happier, and simply more wonderful.
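To make that concrete, here is a minimal sketch of subsetting, grouping, and ordering in a single query. It uses Python's built-in sqlite3 module with an in-memory database so it can run anywhere; the `sales` table, its columns, and its values are made up for illustration and do not come from any real warehouse.

```python
import sqlite3

# Hypothetical in-memory table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 25.0), ("east", 30.0), ("north", 5.0), ("west", 20.0)],
)

# Subset (WHERE), group (GROUP BY), and order (ORDER BY) in one statement.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 10
    GROUP BY region
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
print(totals)
conn.close()
```

The same SELECT should carry over to most warehouses with little or no change, since it sticks to basic clauses.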
So, here are some tools to aid in that quest.
Books
First of all, I would like to introduce you to O’Reilly books, mostly because of two titles I have found repeatedly useful:
- SQL Pocket Guide – A Guide to SQL Usage. This slim volume helps with the details of syntax and everything else I can’t remember. It is less than 200 pages, with a good index. The current edition is the 3rd, and mine is getting a little dog-eared.
- SQL Cookbook – This provides a list of tasks you’re likely to need to accomplish and, for each one, the ways to get it done. Examples of some of the task descriptions: Retrieving a Subset of Rows from a Table; Finding Rows That Satisfy Multiple Conditions; Concatenating Column Values; Transforming Nulls into Real Values; Searching for Patterns.
O’Reilly offers a good deal if you want both the paper and eBook formats of a title, and the books are very well edited and geared towards answering questions.
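To give a flavor of the kinds of tasks listed for the SQL Cookbook, here is a rough sketch of four of them. These are my own illustrations, not the book's recipes. They run against SQLite through Python's built-in sqlite3 module, and the `emp` table and its contents are hypothetical.

```python
import sqlite3

# Hypothetical employee table, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, job TEXT, dept INTEGER, comm REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?)",
    [
        ("Ann", "clerk", 10, None),
        ("Bob", "sales", 30, 300.0),
        ("Cat", "clerk", 20, None),
        ("Dan", "sales", 30, 500.0),
    ],
)

# Finding rows that satisfy multiple conditions (a subset of the table).
subset = conn.execute(
    "SELECT name FROM emp WHERE dept = 30 AND comm > 400 ORDER BY name"
).fetchall()

# Concatenating column values with the || operator.
labels = conn.execute(
    "SELECT name || ' works as ' || job FROM emp ORDER BY name"
).fetchall()

# Transforming nulls into real values with COALESCE.
comms = conn.execute(
    "SELECT name, COALESCE(comm, 0) FROM emp ORDER BY name"
).fetchall()

# Searching for patterns with LIKE (% matches any run of characters).
clerks = conn.execute(
    "SELECT name FROM emp WHERE job LIKE 'cl%' ORDER BY name"
).fetchall()

print(subset, labels, comms, clerks, sep="\n")
conn.close()
```

Syntax for some of these varies by database. Concatenation in particular differs: SQLite uses `||`, while MySQL typically uses `CONCAT()` and SQL Server uses `+`.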
Online tutorials
There are a number of online resources which offer training in a variety of formats (multimedia, written word, etc.) and the best also include support for actually working through problems online.
- Khan Academy offers audio/video training with exercises at: https://www.khanacademy.org/computing/computer-programming/sql – These have the trademark depth and quality of all the Khan Academy offerings, along with exercises, grading, and progress tracking.
- Tutorialspoint offers a written presentation along with diagrams and a coding ground to practice what you are learning. http://www.tutorialspoint.com/sql/index.htm
- SQLZoo is more of a show-it, do-it tutorial arrangement: small bites of information with the chance to try out each thing presented in-line. http://sqlzoo.net/wiki/SQL_Tutorial
- A SQL tutorial from a data scientist’s perspective, for data scientists: http://downloads.bensresearch.com/SQL.pdf
Ephemera
These are some resources I came across that talk about the intersection of R and SQL:
- SQL and R are allies: https://www.simple-talk.com/sql/reporting-services/making-data-analytics-simpler-sql-server-and-r/
- Reaching into R dataframes with SQL: http://www.burns-stat.com/translating-r-sql-basics/
- SQL Server integration with R: https://msdn.microsoft.com/en-us/library/mt604885.aspx
- There is a SQL playground online where you can try out creating databases and using a range of SQL manipulations at http://sqlfiddle.com/. It has a lot in common with R Fiddle (at http://www.r-fiddle.org/#/).