If employers judge you by your skills, then adding even one skill — like SQL — can make a big difference in your paycheck.
SQL is helpful for more than just data scientists. Labor market research by Burning Glass shows that two-thirds of all jobs demanding analytics skills aren't classic data science jobs. Meaning if you're a Marketer, Product Manager, Researcher, or Supply Chain Manager, a dose of SQL savvy could boost your salary premium from $8,000 to $33,000 per year. Not too shabby.
To marketers, code looks scary. But just looking at queries others have written can help you to absorb SQL syntax in the context of a real world question — without the added pressure of having to write it yourself.
Ask your engineers for read access to your production database. Once you get login permissions, whether to a platform like Periscope Data or Postgres database via command line, you can run a saved query and watch the results that get returned. And if you’ve ever used Excel, you'll see that a database table is structured a lot like a spreadsheet — with numbered rows, titled columns, and the ability to be created, queried, updated, and deleted.
Example 1 - Connecting to a Database
In order to query a database, you'll need to connect to one first. If you're using Fusion, we have step-by-step instructions on how to do so for a MySQL data source.
Of the range of BI tools you can use, each has its own merits depending on your use case. That said, among the top tools on the market are Tableau, Looker, MySQL Workbench, Yellowfin BI, Amazon Quicksight, Domo, Metabase, and Qlik.
Once you've connected to your BI tool of choice, you should start by understanding the schema of your database by way of a simple SHOW TABLES; command.
This query you would write in the query editor, or console. Once you hit Run, your database should return the name of every single table in your database.
But what if you want to return all the rows from a certain table? With SELECT * you can return all the rows FROM a table you specify, like us_housing_units:
To return just the column names from a certain table, you can either use DESCRIBE or SHOW COLUMNS. So if I want all columns returned from the fused_contact table, I could do this by writing:
SHOW COLUMNS FROM fused_contact;
Either of these SQL queries will return fields like first_name, email, lead_source_detail, and more. If that seems confusing, here's a diagram that shows how the data gets processed. First the warehouse, hosted on Amazon RDS, looks for your MySQL warehouse. Then the query searches the warehouse tables based on your query, returning all columns:
Example 2 - Auto-Generated & Saved Queries
Some tools like Chartio or Yellowfin BI will auto-generate SQL for you based on the measures (numerical values that mathematical functions work on) and dimensions (qualitative values like sales region, employee, location, or dates) you specify. These auto-generated queries can be a great learning opportunity for learning SQL, should you take the time to study them. Take this auto-generate SQL query in Chartio, for example:
Even if I don't know any SQL at all, I can see that I'm getting data from the "User Rollup" field in a Users table. I'm also trying to return measures like "Cost" and "Count of distinct Activity" by "User Id". So I don't have to be a SQL wizard to begin to piece together how the actual measures and tables relate to a query's particular syntax.
Likewise, if you have a data analyst who writes a lot of queries, odds are he or she saves them somewhere for later use. These saved queries are also a great learning opportunity because you don't have to do any work, per se.
In the Klipfolio query below, you can see we've connected to a MySQL database and the query editor already has a saved SQL statement we can run.
So before you even get into the nitty gritty of SQL, try auto-generated SQL or running a saved SQL query. Then ask why the SQL returned the rows or data visualization it did. And if you're feeling so bold, you might even try using the command line.
Example 3 - Cohort Analysis
Okay, now let's imagine our colleague has taken the time to write the following statement and has given it to me to visualize in Metabase.
When I run the query, Metabase returns a bar chart which shows all my lead sources — contact us form, chat, webinar, ebook, and trial — on the x-axis.
Now let's break it down:
1. SELECT says "give us data" (rather than creating new data or updating or deleting existing data).
2. As in Excel, COUNT in SQL tallies the lead source detail field. So we're measuring the total count of each lead source.
3. To fetch these lead source counts, we need to tell our database where to look. That's what FROM accomplishes. It says, give me data from a table, in this case the fused_contact table.
4. Now we narrow down the results with two conditions: Interest in Fusion (WHERE fusion_interest = 1) and lead source detail fields that return a value, or isn't empty (<> 'null').
5. Tidying up, we group our results (GROUP BY) according to the lead source detail field names (contact us, chat, webinar, ebook, trial).
In just nine lines of code, which practically reads like English, we return a chart that gives us real marketing insights into where our lead sources come from, and how they rank against one another.
Example 4 - Lifecycle Analysis
Lifecycle stages help you organize your contacts based on the stage they are at in your sales cycle: subscriber, lead, marketing qualified lead, sales qualified lead, opportunity, customer, or evangelist.
If you use Fusion, segmenting your contacts by lifecycle stage is relatively easy with the fused_contact table. You can simply SELECT lifecycle_stage and swap or add other column names as you wish. Consider groupings by a picklist such as country, state, status, lead_score, or hubspot_score.
GROUP BY lifecycle_stage ASC;
You can make the output look even nicer by defining clearer names to variables and ordering them in a certain manner. For example:
SELECT lifecycle_stage AS `LCS`
COUNT(lifecycle_stage) AS `Count LCS`
GROUP BY lifecycle_stage
ORDER BY COUNT(lifecycle_stage) DESC;
If you feel like segmenting your opportunities, simply swap out the table fused_contact for fused_opportunity, the latter of which contains fields like is_won, is_closed, close_date, lead_source, amount, deal_stage.
deal_stage, close_date, amount, external_sales_person_id
fiscal_quarter = '1';
To see more examples, we've marshaled some great starter SQL queries for you here.
Example 5 - Joining Tables
OK, now let's look at another example, this time with joins, which can function a lot like returning one segment of a Venn diagram depending on whether they are an inner, outer, left, or right join.
In fact, some BI tools actually visualize joins this way. In Tableau you can join two tables without any SQL whatsoever. All you need to do is specify your tables. Here we have the fused_opportunity and fused_contact table, which we're joining via the links_contact_opportunity table.
Start with two tables, fused_opportunity and links_contact_opportunity, first:
Add a third table, fused_contact. In essence, you will JOIN tables ON Contact Id, using an inner join:
Then be sure to specify the kinds of joins you want between the tables. Again, in the majority of use cases, you'll want an inner join and need to join on a primary key, like Contact Id.
But what does a join look like as an actual SQL query?
Have a look at the example below and we'll break it down one line at a time.
1. As always, we use SELECT to get data from the database, specifying the columns we'll return — Id, Name, Product_Name, Quantity, Cost, and MRR — and from which table —Subscriptions.
2. Now we join tables. A JOIN clause combines rows from two or more tables, based on a related column between them. Here we have two joins, one that joins the Accounts and Subscriptions tables, another the Products and Subscriptions tables.
Both specify the column on which they'll join the tables, AccountId and ProductId respectively:
Accounts ON Subscriptions.AccountId
Products ON Subscriptions.ProductId
Notice how the dot notation combines the names of a table (Subscriptions) and columns (e.g. AccountId or ProductId) to let the database know to which columns and tables I’m referring.
3. Like our previous example, we then narrow down the results, saying, "Only return rows where the MRR is greater than 49.99 and the product name is Klipfolio".
4. Lastly, we order the results by account name.
Example 6 - Creating SQL Views
Bedrock Data recently announced the ability to create SQL views, also known as "user views" and "database views".
Views are virtual tables that define a table and accelerate your analysis by saving searchable object in a database which you've defined by a query. Though a view doesn't store data, per se, you can query a view as you would a table.
So if you're using Fusion, which supports SQL database views, you can save columns and rows results for future use. So if you want to remember the contacts with whom you’ve spoken in the last two weeks in the west coast territory, you only need query once, save the table as a user view, and access it anywhere, anytime.
To create a view, you'd use the CREATE VIEW statement, which creates the table (view_name), which you can then make an alias for using the AS clause.SQL aliases are used to give a table, or a column in a table, a temporary name. Aliases are often used to make column names more readable, but they can also be used to save the view you create. For example:
CREATE VIEW view_name AS
After this introductory clause, you'd then specify which columns, tables, and conditions you want returned in your view.
CREATE VIEW view_name AS alias_name
SELECT column1, column2, ...
Views are great to perform calculations so you don't have to apply a formula over and over again. For instance:
CREATE VIEW total_sales AS
SELECT orderNumber, SUM(quantityOrdered * priceEach) total
WHERE created_at >= '2018-05-28' and created_at <= '2018-09-28'
GROUP BY orderNumber DESC;
Here we're creating the view total_sales by accessing the orderDetails table, specifically the orderNumber field.
Then we use the SUM() function to get our total purchase amount, bookending our date range as the summer of 2018 from the end of May to the end of September.
Finally, we order our results in descending order with DESC.
Saving this user view lets us see our total sales a whole lot faster. And if we save this query as a visualization in our BI tool of choice, we can then watch the visualization update on the fly as the summer moves along. Since our Fusion MySQL database updates every 30 minutes, we'll also get the latest and greatest data from our warehouse.
And that, my friends, is your SQL primer for marketers. If you have any questions or comments, post them below. Once again, for free SQL queries that can get you started in Fusion, just click the link to download:
Want to create a Fusion account? You can get started for free right here: