Link: SQL

Select, From, Where

  • FROM specify the table
  • SELECT apply to columns.
    • My note: SELECT is different from Select in R. It does not mean selecting column xyz and only xyz will be loaded. In SQL it means xyz are the columns that would appear in the final dataset! In addition, all columns included in SELECT must be passed through Group by, Having, Count clause if it exists.
  • WHERE apply to rows (filter), see more in SQL filter: WHERE
  • WHERE...LIKE: filter. We can also use wildcard % for any number of characters.
query = """
        SELECT Name, Type
        FROM `bigquery-public-data.pet_records.pet`
        WHERE Animal = "Cat"
        """
 
-- Example for WHERE LIKE
WHERE Name LIKE "Ripley"
WHERE Name like '%ipl%'
 
-- Select all columns
SELECT * FROM table1
 
-- Select as alias
SELECT `by` AS author FROM table1

Alias for SELECT … FROM

We can use alias for selected table too:

SELECT
  g.book_title
FROM goodreads AS g;

SELECT DISTINCT

Often combine with SELECT: SELECT DISTINCT

DISTINCT multiple columns

If two ore more columns in SELECT DISTINCT: use comma to separate

SELECT DISTINCT user_id, status FROM trades