
SQL Resources, or Show Me The Data!

As I am pretty sure was said, loudly and repeatedly, in the movie Jerry Maguire: “Show Me The Data!”
It is the Data Scientist’s Mantra...


Data is in our name, and you really do need data to do your work. We often have a single place to store and find all our data (a data warehouse) for the work we do. But of course, the better the tools you have to group it, subset it, order it, and otherwise transform it, the more insights you can reveal with the rest of your skills.
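To make "group it, subset it, order it" concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for a real data warehouse. The `sales` table, its columns, and its rows are all invented for illustration.

```python
import sqlite3

# Hypothetical in-memory table; names and values are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "widget", 120.0),
        ("East", "gadget", 80.0),
        ("West", "widget", 250.0),
        ("West", "gadget", 5.0),
        ("North", "widget", 40.0),
    ],
)

# Subset (WHERE), group (GROUP BY), transform (SUM), and order (ORDER BY)
# in a single query.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 10
    GROUP BY region
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
for region, total in totals:
    print(region, total)
```

The West gadget sale is filtered out before the totals are computed, so the subsetting happens ahead of the grouping.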


To command all the power that the data warehouse offers and that your analytical work deserves, having a solid grip on SQL (Structured Query Language) is key. It will make your life easier, happier, and simply more wonderful.


So, here are some tools to aid in that quest.

Books

First of all, I would like to introduce you to O’Reilly books, mostly because of two titles I have found repeatedly useful:


  • SQL Pocket Guide – A Guide to SQL Usage: this slim volume helps with the details of syntax and everything else I can’t remember. Fewer than 200 pages, with a good index. The current edition is the 3rd, and mine is getting a little dog-eared.


  • SQL Cookbook – This provides a list of tasks you’re likely to need to accomplish and, for each, the ways to get it done. Examples of the task descriptions: Retrieving a Subset of Rows from a Table; Finding Rows That Satisfy Multiple Conditions; Concatenating Column Values; Transforming Nulls into Real Values; Searching for Patterns.


O’Reilly offers a good deal if you want both the paper and eBook formats of a title, and the books are very well edited and geared towards answering questions.
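To give a flavor of the Cookbook-style tasks named above, here is a rough sketch of four of them, run through Python's built-in sqlite3 module. These are not the book's own recipes. The `staff` table and its contents are hypothetical, and the syntax shown is what SQLite accepts, so other databases may differ (for example, some use CONCAT or + rather than || for concatenation).

```python
import sqlite3

# Hypothetical table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (first TEXT, last TEXT, dept TEXT, salary INTEGER, bonus INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?, ?)",
    [
        ("Ada", "Lovelace", "Analytics", 120, 10),
        ("Grace", "Hopper", "Analytics", 90, None),
        ("Alan", "Turing", "Research", 110, None),
        ("Edgar", "Codd", "Research", 95, 5),
    ],
)

# Finding rows that satisfy multiple conditions: combine tests with AND.
multi = conn.execute(
    "SELECT first FROM staff WHERE dept = 'Analytics' AND salary > 100"
).fetchall()

# Concatenating column values: || joins strings in SQLite.
names = conn.execute(
    "SELECT first || ' ' || last FROM staff ORDER BY last"
).fetchall()

# Transforming nulls into real values: COALESCE returns its first non-null argument.
bonuses = conn.execute(
    "SELECT first, COALESCE(bonus, 0) FROM staff ORDER BY first"
).fetchall()

# Searching for patterns: LIKE with % as a wildcard for any run of characters.
pattern = conn.execute(
    "SELECT last FROM staff WHERE last LIKE '%o%' ORDER BY last"
).fetchall()

print(multi, names, bonuses, pattern, sep="\n")
```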


Online tutorials

There are a number of online resources which offer training in a variety of formats (multimedia, written word, etc.) and the best also include support for actually working through problems online.
  • Khan Academy offers audio/video training with exercises at: https://www.khanacademy.org/computing/computer-programming/sql – These have the trademark depth and quality of all the Khan Academy offerings, along with exercises, grading, and progress tracking.


  • Tutorialspoint offers a written presentation along with diagrams and a coding ground to practice what you are learning. http://www.tutorialspoint.com/sql/index.htm
  • SQLZoo is more of a show-it, do-it tutorial arrangement: small bites of information with the chance to try out each thing presented in-line. http://sqlzoo.net/wiki/SQL_Tutorial
  • A SQL tutorial from a data scientist’s perspective, for data scientists: http://downloads.bensresearch.com/SQL.pdf

Ephemera

These are some resources I came across that talk about the intersections of R and SQL:
  • SQL and R are allies: https://www.simple-talk.com/sql/reporting-services/making-data-analytics-simpler-sql-server-and-r/
  • Reaching into R data frames with SQL: http://www.burns-stat.com/translating-r-sql-basics/
  • There is a SQL playground online where you can try out creating databases and using a range of SQL manipulations at http://sqlfiddle.com/. It’s got a lot in common with R Fiddle (at http://www.r-fiddle.org/#/).
