SQL Basics
The pandas workflow is a common favorite among data analysts and data scientists. The workflow looks something like this:
The pandas workflow works well when:
When the data changes frequently, requires shared access, doesn’t fit in memory, and security is critical, a database is a much better solution. A database is a data representation that lives on disk and can be queried, accessed, and updated without using much memory. We primarily interact with a database using a database management system, or DBMS for short.
In the pandas workflow, we spend most of our time thinking about what functions and methods to use, where to store intermediate results in variables, and juggling all of these. To work with data stored in a database, we instead use a language called SQL (or structured query language). In SQL, we express each unique request (whether it be fetching a subset of or editing values in the data) as a single query and then ask the DBMS to run the query and display any results.
For example, to fetch a specific subset of the data from a database, we would:
SELECT * FROM salaries
Here’s what the database workflow looks like:
Because the data lives on disk, we can work with datasets that consume multiple terabytes of disk space. Many data science teams in industry have servers and setups in cloud environments like Microsoft Azure or Amazon Web Services that let team members work with this scale of data. Robust and popular DBMS tools include powerful features for managing user credentials, security, and high data throughput (quickly changing data). In this course and the next, we’ll learn the fundamentals of SQL using a small, portable DBMS called SQLite. SQLite is one of the most widely used databases in the world and is lightweight enough to be bundled with other software. In later courses, we’ll dive into production systems like Postgres.
In this course, we’ll explore data from the American Community Survey on job outcome statistics based on college majors. While the original data is available as a CSV file, we’ll be using a slightly modified version of the data that’s stored as a database. We’ll be working with a subset of the data that contains the 2010-2012 data for recent college grads only. In this post, we’ll learn how to write SQL queries to explore and start to understand the dataset.
Whenever we encountered a new dataset in the past, we displayed the first few rows to get familiar with the different columns, types of values, and some sample data.
We’ve loaded the dataset on job outcome statistics into a database. A database usually consists of multiple, related tables of data. Each table contains rows and columns, just like a CSV file. We’ll be working with the database file jobs.db, which contains a single table named recent_grads. In later courses, we’ll learn how to work with a database containing multiple tables.
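As a minimal sketch of what “a database containing a table” looks like in practice, the snippet below uses Python’s built-in sqlite3 module to list the tables and columns in a database. It builds a small in-memory stand-in for jobs.db with only three of the real columns. That stand-in and the variable names are assumptions made for illustration, not part of the course material.

```python
import sqlite3

# Assumption: an in-memory database stands in for the jobs.db file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Major_category TEXT, ShareWomen REAL)"
)

# sqlite_master is SQLite's built-in catalog of tables, indexes, and views.
table_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# PRAGMA table_info returns one row per column; the column name is field 1.
column_names = [
    row[1] for row in conn.execute("PRAGMA table_info(recent_grads)")
]

print(table_names)
print(column_names)
```

With the real file, the only change would be passing its path to sqlite3.connect instead of ":memory:".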
To display the first 5 rows from the recent_grads table, we need to:
Like other programming languages, code in SQL has to adhere to a defined structure and vocabulary. To specify that we want to return the first 5 rows from recent_grads, we need to run the following SQL query:
SELECT * FROM recent_grads LIMIT 5
index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | 1849 | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193 |
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | 556 | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50 |
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | 558 | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0 |
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | 1069 | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0 |
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972 |
In this query, we specified the columns we wanted using SELECT *, the table we wanted to query using FROM recent_grads, and the number of rows we wanted using LIMIT 5.
Here’s a visual breakdown of the different components of the query:
Writing and running SQL queries in our interface is similar to writing and running Python code. Type the query in the code cell and click Run to execute the query against the database. If you write multiple queries in a code cell, SQLite will only display the last query’s results.
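Outside of that interface, the same kind of query can be run from Python. The sketch below is one possible way to do it with the built-in sqlite3 module. It assumes an in-memory table that stands in for recent_grads and holds only three columns and six rows copied from the preview in this post.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads in jobs.db.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Major_category TEXT, Median INTEGER)"
)
sample_rows = [
    ("PETROLEUM ENGINEERING", "Engineering", 110000),
    ("MINING AND MINERAL ENGINEERING", "Engineering", 75000),
    ("METALLURGICAL ENGINEERING", "Engineering", 73000),
    ("NAVAL ARCHITECTURE AND MARINE ENGINEERING", "Engineering", 70000),
    ("CHEMICAL ENGINEERING", "Engineering", 65000),
    ("NUCLEAR ENGINEERING", "Engineering", 65000),
]
conn.executemany("INSERT INTO recent_grads VALUES (?, ?, ?)", sample_rows)

# SELECT * asks for every column; LIMIT 5 caps the result at five rows.
first_five = conn.execute("SELECT * FROM recent_grads LIMIT 5").fetchall()
for row in first_five:
    print(row)
```

Each returned row is a plain Python tuple, one value per column.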
Let’s write a SQL query that returns the first 10 rows from recent_grads.
index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | 1849 | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193 |
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | 556 | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50 |
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | 558 | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0 |
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | 1069 | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0 |
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972 |
5 | 6 | 2418 | NUCLEAR ENGINEERING | Engineering | 2573 | 17 | 2200 | 373 | 0.144967 | 1857 | 2038 | 264 | 1449 | 400 | 0.177226 | 65000 | 50000 | 102000 | 1142 | 657 | 244 |
6 | 7 | 6202 | ACTUARIAL SCIENCE | Business | 3777 | 51 | 832 | 960 | 0.535714 | 2912 | 2924 | 296 | 2482 | 308 | 0.095652 | 62000 | 53000 | 72000 | 1768 | 314 | 259 |
7 | 8 | 5001 | ASTRONOMY AND ASTROPHYSICS | Physical Sciences | 1792 | 10 | 2110 | 1667 | 0.441356 | 1526 | 1085 | 553 | 827 | 33 | 0.021167 | 62000 | 31500 | 109000 | 972 | 500 | 220 |
8 | 9 | 2414 | MECHANICAL ENGINEERING | Engineering | 91227 | 1029 | 12953 | 2105 | 0.139793 | 76442 | 71298 | 13101 | 54639 | 4650 | 0.057342 | 60000 | 48000 | 70000 | 52844 | 16384 | 3253 |
9 | 10 | 2408 | ELECTRICAL ENGINEERING | Engineering | 81527 | 631 | 8407 | 6548 | 0.437847 | 61928 | 55450 | 12695 | 41413 | 3895 | 0.059174 | 60000 | 45000 | 72000 | 45829 | 10874 | 3170 |
SQLite ran our query and returned the first 10 rows and all columns from the recent_grads table. Spend some time getting familiar with what each column represents.
Based on this dataset preview and an understanding of what each column represents, here are some questions we may have:
Let’s start by focusing on the first question. The SQL workflow revolves around translating the question we want to answer to the subset of data we want from the database. To determine which majors had mostly female students, we want the following subset:
only the Major column
only the rows where ShareWomen is greater than 0.5 (corresponding to 50%)
To return only the Major column, we need to add the specific column name in the SELECT statement part of the query (instead of using the * operator to return all columns):
SELECT Major FROM recent_grads
This will return all of the values in the Major column. We can specify multiple columns this way as well, and the results table will preserve the order of the columns.
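To illustrate that the result columns follow the order written in SELECT rather than the order stored in the table, here is a small sketch using Python’s sqlite3 module. The in-memory table is an assumed stand-in for recent_grads, with two rows copied from the preview in this post.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Major_category TEXT, ShareWomen REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564),
        ("ACTUARIAL SCIENCE", "Business", 0.535714),
    ],
)

# ShareWomen is listed before Major here, the reverse of the table's order.
cursor = conn.execute("SELECT ShareWomen, Major FROM recent_grads")
result_columns = [col[0] for col in cursor.description]
result_rows = cursor.fetchall()

print(result_columns)
print(result_rows)
```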
To return only the values where ShareWomen is greater than or equal to 0.5, we need to add a WHERE clause:
SELECT Major FROM recent_grads
WHERE ShareWomen >= 0.5
Finally, we can limit the number of rows returned using LIMIT. Adding LIMIT 5 to the end of the query above returns the following rows:
Major |
---|
ACTUARIAL SCIENCE |
COMPUTER SCIENCE |
ENVIRONMENTAL ENGINEERING |
NURSING |
INDUSTRIAL PRODUCTION TECHNOLOGIES |
Here’s a breakdown of the different components:
In the SELECT part of the query, we express the specific columns we want; in the WHERE part, we express the specific rows we want. The beauty of SQL is that these two choices are independent of each other.
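The sketch below illustrates that independence: the same WHERE condition is paired with two different column lists, and the set of matching rows does not change. It uses Python’s sqlite3 module with an assumed in-memory stand-in for recent_grads, populated with a few rows copied from this post.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recent_grads (Major TEXT, ShareWomen REAL)")
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?)",
    [
        ("PETROLEUM ENGINEERING", 0.120564),
        ("CHEMICAL ENGINEERING", 0.341631),
        ("ACTUARIAL SCIENCE", 0.535714),
        ("ENVIRONMENTAL ENGINEERING", 0.558548),
        ("INDUSTRIAL PRODUCTION TECHNOLOGIES", 0.750473),
    ],
)

# Same row filter, different column choices.
majors_only = conn.execute(
    "SELECT Major FROM recent_grads WHERE ShareWomen >= 0.5"
).fetchall()
majors_with_share = conn.execute(
    "SELECT Major, ShareWomen FROM recent_grads WHERE ShareWomen >= 0.5"
).fetchall()

print(majors_only)
print(majors_with_share)
```

Changing the column list only changes how wide each row is, not which rows come back.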
Let’s write a SQL query that returns the majors where females were a minority. We’ll only return the Major and ShareWomen columns (in that order) and won’t limit the number of rows returned.
SELECT Major, ShareWomen FROM recent_grads
WHERE ShareWomen < 0.5
Major | ShareWomen |
---|---|
PETROLEUM ENGINEERING | 0.120564 |
MINING AND MINERAL ENGINEERING | 0.101852 |
METALLURGICAL ENGINEERING | 0.153037 |
NAVAL ARCHITECTURE AND MARINE ENGINEERING | 0.107313 |
CHEMICAL ENGINEERING | 0.341631 |
NUCLEAR ENGINEERING | 0.144967 |
ASTRONOMY AND ASTROPHYSICS | 0.441356 |
MECHANICAL ENGINEERING | 0.139793 |
ELECTRICAL ENGINEERING | 0.437847 |
COMPUTER ENGINEERING | 0.199413 |
AEROSPACE ENGINEERING | 0.196450 |
BIOMEDICAL ENGINEERING | 0.119559 |
MATERIALS SCIENCE | 0.310820 |
ENGINEERING MECHANICS PHYSICS AND SCIENCE | 0.183985 |
BIOLOGICAL ENGINEERING | 0.320784 |
INDUSTRIAL AND MANUFACTURING ENGINEERING | 0.343473 |
GENERAL ENGINEERING | 0.252960 |
ARCHITECTURAL ENGINEERING | 0.350442 |
COURT REPORTING | 0.236063 |
FOOD SCIENCE | 0.222695 |
ELECTRICAL ENGINEERING TECHNOLOGY | 0.325092 |
MATERIALS ENGINEERING AND MATERIALS SCIENCE | 0.292607 |
MANAGEMENT INFORMATION SYSTEMS AND STATISTICS | 0.278790 |
CIVIL ENGINEERING | 0.227118 |
CONSTRUCTION SERVICES | 0.342229 |
OPERATIONS LOGISTICS AND E-COMMERCE | 0.322222 |
MISCELLANEOUS ENGINEERING | 0.189970 |
PUBLIC POLICY | 0.251389 |
ENGINEERING TECHNOLOGIES | 0.090713 |
MISCELLANEOUS FINE ARTS | 0.410180 |
GEOLOGICAL AND GEOPHYSICAL ENGINEERING | 0.324838 |
FINANCE | 0.355469 |
ECONOMICS | 0.340825 |
BUSINESS ECONOMICS | 0.249190 |
NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL … | 0.430537 |
ACCOUNTING | 0.253583 |
MATHEMATICS | 0.244103 |
PHYSICS | 0.448099 |
MEDICAL TECHNOLOGIES TECHNICIANS | 0.434298 |
STATISTICS AND DECISION SCIENCE | 0.281936 |
ENGINEERING AND INDUSTRIAL MANAGEMENT | 0.174123 |
MEDICAL ASSISTING SERVICES | 0.178982 |
COMPUTER PROGRAMMING AND DATA PROCESSING | 0.269194 |
GENERAL BUSINESS | 0.417925 |
ARCHITECTURE | 0.321770 |
INTERNATIONAL BUSINESS | 0.282903 |
PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTR… | 0.451465 |
MOLECULAR BIOLOGY | 0.077453 |
MISCELLANEOUS BUSINESS & MEDICAL ADMINISTRATION | 0.200023 |
MISCELLANEOUS ENGINEERING TECHNOLOGIES | 0.000000 |
MECHANICAL ENGINEERING RELATED TECHNOLOGIES | 0.377437 |
INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY | 0.436302 |
PHYSICAL SCIENCES | 0.426924 |
MILITARY TECHNOLOGIES | 0.429685 |
ELECTRICAL, MECHANICAL, AND PRECISION TECHNOLO… | 0.232444 |
MARKETING AND MARKETING RESEARCH | 0.382900 |
POLITICAL SCIENCE AND GOVERNMENT | 0.485930 |
GEOGRAPHY | 0.473190 |
COMPUTER ADMINISTRATION MANAGEMENT AND SECURITY | 0.180883 |
COMPUTER NETWORKING AND TELECOMMUNICATIONS | 0.305005 |
GEOLOGY AND EARTH SCIENCE | 0.470197 |
PUBLIC ADMINISTRATION | 0.476461 |
COMMUNICATIONS | 0.305109 |
CRIMINAL JUSTICE AND FIRE PROTECTION | 0.125035 |
COMMERCIAL ART AND GRAPHIC DESIGN | 0.374356 |
SPECIAL NEEDS EDUCATION | 0.366177 |
TRANSPORTATION SCIENCES AND TECHNOLOGIES | 0.321296 |
NEUROSCIENCE | 0.475010 |
MULTI/INTERDISCIPLINARY STUDIES | 0.495397 |
ATMOSPHERIC SCIENCES AND METEOROLOGY | 0.124950 |
EDUCATIONAL ADMINISTRATION AND SUPERVISION | 0.448732 |
PHILOSOPHY AND RELIGIOUS STUDIES | 0.416810 |
ENGLISH LANGUAGE AND LITERATURE | 0.339671 |
SCIENCE AND COMPUTER TEACHER EDUCATION | 0.423209 |
MUSIC | 0.444582 |
COSMETOLOGY SERVICES AND CULINARY ARTS | 0.383719 |
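To filter rows by specific criteria, we need to use the WHERE statement. A simple WHERE statement requires three things:
the column we want the database to filter on: ShareWomen
a comparison operator that specifies how to compare each value in that column: >
the value we want the database to compare each value to: 0.5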
Here are the comparison operators we can use:
less than: <
less than or equal to: <=
greater than: >
greater than or equal to: >=
equal to: =
not equal to: !=
The comparison value after the operator must be either text or a number, depending on the field. Because ShareWomen is a numeric column, we don’t need to enclose the number 0.5 in quotes. Finally, most database systems require that the SELECT and FROM statements come first, before WHERE or any other statements.
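As a rough illustration of the quoting rule, the sketch below compares a text column against a quoted value and a numeric column against a bare number. It uses Python’s sqlite3 module and an assumed in-memory stand-in for recent_grads. The helper majors_where is hypothetical, written only for this example, and it sorts results in Python to keep the output predictable.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Major_category TEXT, ShareWomen REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564),
        ("ACTUARIAL SCIENCE", "Business", 0.535714),
        ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 0.441356),
        ("ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548),
    ],
)


def majors_where(condition):
    """Hypothetical helper: return sorted Major values matching a WHERE condition."""
    query = f"SELECT Major FROM recent_grads WHERE {condition}"
    return sorted(major for (major,) in conn.execute(query))


# Text values are enclosed in single quotes; numbers are written bare.
engineering = majors_where("Major_category = 'Engineering'")
not_engineering = majors_where("Major_category != 'Engineering'")
at_most_half_women = majors_where("ShareWomen <= 0.5")

print(engineering)
print(not_engineering)
print(at_most_half_women)
```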
We can use the AND operator to combine multiple filter criteria. For example, to determine which engineering majors had a majority of female graduates, we’d need to specify 2 filtering criteria: one on Major_category and one on ShareWomen.
Major |
---|
ENVIRONMENTAL ENGINEERING |
INDUSTRIAL PRODUCTION TECHNOLOGIES |
It looks like only 2 majors met these criteria. If we wanted to “zoom” back out and look at all of the columns for both of these majors to see if they share other common attributes, we can modify the SELECT statement and use the symbol * to represent all columns:
SELECT * FROM recent_grads
WHERE Major_category = 'Engineering' AND ShareWomen > 0.5
index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 31 | 2410 | ENVIRONMENTAL ENGINEERING | Engineering | 4047 | 26 | 2639 | 3339 | 0.558548 | 2983 | 2384 | 930 | 1951 | 308 | 0.093589 | 50000 | 42000 | 56000 | 2028 | 830 | 260 |
38 | 39 | 2503 | INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 4631 | 73 | 528 | 1588 | 0.750473 | 4428 | 3988 | 597 | 3242 | 129 | 0.028308 | 46000 | 35000 | 65000 | 1394 | 2454 | 480 |
The ability to quickly iterate on queries as you think of new questions is the appeal of SQL. The SQL workflow lets data professionals focus on asking and answering questions, instead of lower-level programming concepts. There’s a clear separation of concerns between the engine that stores, organizes, and retrieves the data and the language that lets people interface with the data easily.
As the scale of data has increased, engineers have maintained the interface of SQL while swapping out the database engine underneath. This allows people who need to ask and answer questions to easily transfer their SQL experience, even as database technologies change. For example, some query engines let you write SQL while reading data from database systems like MySQL, from a distributed file system like HDFS, and more.
Let’s write a SQL query that returns all majors that had a majority of female graduates and a median salary greater than 50000. Let’s only include the following columns in the results and in this order:
Major
Major_category
Median
ShareWomen
Major | Major_category | Median | ShareWomen |
---|---|---|---|
ACTUARIAL SCIENCE | Business | 62000 | 0.535714 |
COMPUTER SCIENCE | Computers & Mathematics | 53000 | 0.578766 |
We used the AND operator to specify that our filter needs to pass two Boolean conditions. Both of the conditions had to evaluate to True for the record to appear in the result set. If we wanted to specify a filter that meets either of the conditions instead, we would use the OR operator.
SELECT [column1, column2,...] FROM [table1]
WHERE [condition1] OR [condition2]
We’ll dive straight into a practice problem because we use the OR and AND operators in similar ways.
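Before the practice problem, here is a small side-by-side sketch of AND versus OR. It applies two conditions that appear elsewhere in this post (ShareWomen > 0.5 and Unemployment_rate < 0.051) to an assumed in-memory stand-in for recent_grads, using Python’s sqlite3 module. The four rows are copied from tables in this post, and results are sorted in Python for predictable output.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, ShareWomen REAL, Unemployment_rate REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", 0.120564, 0.018381),               # low unemployment only
        ("MINING AND MINERAL ENGINEERING", 0.101852, 0.117241),      # neither condition
        ("ENVIRONMENTAL ENGINEERING", 0.558548, 0.093589),           # majority women only
        ("INDUSTRIAL PRODUCTION TECHNOLOGIES", 0.750473, 0.028308),  # both conditions
    ],
)

# AND keeps a row only when both conditions are true.
both = sorted(
    major
    for (major,) in conn.execute(
        "SELECT Major FROM recent_grads "
        "WHERE ShareWomen > 0.5 AND Unemployment_rate < 0.051"
    )
)

# OR keeps a row when at least one condition is true.
either = sorted(
    major
    for (major,) in conn.execute(
        "SELECT Major FROM recent_grads "
        "WHERE ShareWomen > 0.5 OR Unemployment_rate < 0.051"
    )
)

print(both)
print(either)
```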
Write a SQL query that returns the first 20 majors that either have a Median salary greater than or equal to 10,000, or have less than or equal to 1,000 Unemployed people. Let’s only include the following columns in the results and in this order:
Major
Median
Unemployed
Major | Median | Unemployed |
---|---|---|
PETROLEUM ENGINEERING | 110000 | 37 |
MINING AND MINERAL ENGINEERING | 75000 | 85 |
METALLURGICAL ENGINEERING | 73000 | 16 |
NAVAL ARCHITECTURE AND MARINE ENGINEERING | 70000 | 40 |
CHEMICAL ENGINEERING | 65000 | 1672 |
NUCLEAR ENGINEERING | 65000 | 400 |
ACTUARIAL SCIENCE | 62000 | 308 |
ASTRONOMY AND ASTROPHYSICS | 62000 | 33 |
MECHANICAL ENGINEERING | 60000 | 4650 |
ELECTRICAL ENGINEERING | 60000 | 3895 |
COMPUTER ENGINEERING | 60000 | 2275 |
AEROSPACE ENGINEERING | 60000 | 794 |
BIOMEDICAL ENGINEERING | 60000 | 1019 |
MATERIALS SCIENCE | 60000 | 78 |
ENGINEERING MECHANICS PHYSICS AND SCIENCE | 58000 | 23 |
BIOLOGICAL ENGINEERING | 57100 | 589 |
INDUSTRIAL AND MANUFACTURING ENGINEERING | 57000 | 699 |
GENERAL ENGINEERING | 56000 | 2859 |
ARCHITECTURAL ENGINEERING | 54000 | 170 |
COURT REPORTING | 54000 | 11 |
There’s a certain class of questions that we can’t answer using only the techniques we’ve learned so far. For example, if we wanted to write a query that returned all Engineering majors that either had mostly female graduates or an unemployment rate below 5.1%, we would need to use parentheses to express this more complex logic.
The three raw conditions we’ll need are:
Major_category = 'Engineering'
ShareWomen > 0.5
Unemployment_rate < 0.051
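What the SQL query looks like using parentheses:
select Major, Major_category, ShareWomen, Unemployment_rate
from recent_grads
where (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051)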
The first thing you may notice is that we didn’t capitalize any of the operators or statements in the query. SQL’s built-in keywords are case-insensitive, which means we don’t have to capitalize operators like AND or statements like SELECT. This also goes for the column names (you can use either major_category or Major_category). We’ll stick to using capitalized SQL and the original column names to stay consistent.
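A quick way to convince ourselves of this is to run the same query in both spellings and compare the results. The sketch below does that with Python’s sqlite3 module against an assumed in-memory stand-in for recent_grads. It also shows a related caveat that holds for SQLite’s default settings: the text value inside the quotes is still compared case-sensitively by the = operator.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recent_grads (Major TEXT, Major_category TEXT)")
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering"),
        ("ACTUARIAL SCIENCE", "Business"),
    ],
)

# Keywords and column names written in the original capitalization.
capitalized = conn.execute(
    "SELECT Major FROM recent_grads WHERE Major_category = 'Engineering'"
).fetchall()

# The same query with lowercase keywords and lowercase column names.
lowercase = conn.execute(
    "select major from recent_grads where major_category = 'Engineering'"
).fetchall()

# The quoted value is data, not a keyword: with SQLite's default collation,
# 'engineering' does not equal 'Engineering'.
lowercase_value = conn.execute(
    "SELECT Major FROM recent_grads WHERE Major_category = 'engineering'"
).fetchall()

print(capitalized)
print(lowercase)
print(lowercase_value)
```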
The second thing you may notice is how we enclosed the logic we wanted to be evaluated together in parentheses. This is very similar to how we group mathematical calculations together in a particular order. The parentheses make it explicitly clear to the database that we want all of the rows where both of the parenthesized expressions evaluate to True:
(Major_category = 'Engineering') -> True or False
(ShareWomen > 0.5 OR Unemployment_rate < 0.051) -> True or False
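If we had written the WHERE statement without any parentheses, the database wouldn’t guess at our intentions. It would apply its operator precedence rules, in which AND binds more tightly than OR, and evaluate the condition as (Major_category = 'Engineering' AND ShareWomen > 0.5) OR (Unemployment_rate < 0.051).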
Leaving the parentheses out means the AND is evaluated before the OR, regardless of how we intended the logic to be grouped, and the query wouldn’t return the data we want. Now let’s run our intended query and see the results!
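To see the difference concretely, the sketch below runs the condition three ways: with the intended parentheses, with no parentheses, and with the grouping that AND-before-OR precedence produces. It uses Python’s sqlite3 module and an assumed in-memory stand-in for recent_grads with five rows copied from this post. The helper majors_where is hypothetical and sorts results in Python for predictable output.

```python
import sqlite3

# Assumption: a tiny in-memory table stands in for recent_grads.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads "
    "(Major TEXT, Major_category TEXT, ShareWomen REAL, Unemployment_rate REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564, 0.018381),
        ("MINING AND MINERAL ENGINEERING", "Engineering", 0.101852, 0.117241),
        ("ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548, 0.093589),
        ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 0.441356, 0.021167),
        ("ACTUARIAL SCIENCE", "Business", 0.535714, 0.095652),
    ],
)


def majors_where(condition):
    """Hypothetical helper: return sorted Major values matching a WHERE condition."""
    query = f"SELECT Major FROM recent_grads WHERE {condition}"
    return sorted(major for (major,) in conn.execute(query))


intended = majors_where(
    "(Major_category = 'Engineering') "
    "AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)"
)
no_parentheses = majors_where(
    "Major_category = 'Engineering' "
    "AND ShareWomen > 0.5 OR Unemployment_rate < 0.051"
)
and_first = majors_where(
    "(Major_category = 'Engineering' AND ShareWomen > 0.5) "
    "OR (Unemployment_rate < 0.051)"
)

print(intended)
print(no_parentheses)
print(and_first)
```

In this sample, the version without parentheses lets a non-engineering major through because its unemployment rate alone satisfies the OR.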
Let’s run the query we explored above, which returns all Engineering majors that either had mostly women graduates or had an unemployment rate below 5.1%, which was the unemployment rate in August 2015. Let’s only include the following columns in the results and in this order:
Major
Major_category
ShareWomen
Unemployment_rate
SELECT Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
Major | Major_category | ShareWomen | Unemployment_rate |
---|---|---|---|
PETROLEUM ENGINEERING | Engineering | 0.120564 | 0.018381 |
METALLURGICAL ENGINEERING | Engineering | 0.153037 | 0.024096 |
NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 0.107313 | 0.050125 |
MATERIALS SCIENCE | Engineering | 0.310820 | 0.023043 |
ENGINEERING MECHANICS PHYSICS AND SCIENCE | Engineering | 0.183985 | 0.006334 |
INDUSTRIAL AND MANUFACTURING ENGINEERING | Engineering | 0.343473 | 0.042876 |
MATERIALS ENGINEERING AND MATERIALS SCIENCE | Engineering | 0.292607 | 0.027789 |
ENVIRONMENTAL ENGINEERING | Engineering | 0.558548 | 0.093589 |
INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 0.750473 | 0.028308 |
ENGINEERING AND INDUSTRIAL MANAGEMENT | Engineering | 0.174123 | 0.033652 |
The results of every query we’ve written so far have been ordered by the Rank column. Recall a query from early in the post, which returned all of the columns and didn’t filter rows on any specific criteria:
index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | 1849 | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193 |
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | 556 | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50 |
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | 558 | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0 |
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | 1069 | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0 |
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972 |
As the questions we want to answer get more complex, we want more control over how the results are ordered. We can specify the order using the ORDER BY clause. For example, we may want to understand which majors that met the criteria in the WHERE statement had the lowest unemployment rate. The following query returns the results in ascending order by the Unemployment_rate column.
SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate
Rank | Major | Major_category | ShareWomen | Unemployment_rate |
---|---|---|---|---|
15 | ENGINEERING MECHANICS PHYSICS AND SCIENCE | Engineering | 0.183985 | 0.006334 |
1 | PETROLEUM ENGINEERING | Engineering | 0.120564 | 0.018381 |
14 | MATERIALS SCIENCE | Engineering | 0.310820 | 0.023043 |
3 | METALLURGICAL ENGINEERING | Engineering | 0.153037 | 0.024096 |
24 | MATERIALS ENGINEERING AND MATERIALS SCIENCE | Engineering | 0.292607 | 0.027789 |
39 | INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 0.750473 | 0.028308 |
51 | ENGINEERING AND INDUSTRIAL MANAGEMENT | Engineering | 0.174123 | 0.033652 |
17 | INDUSTRIAL AND MANUFACTURING ENGINEERING | Engineering | 0.343473 | 0.042876 |
4 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 0.107313 | 0.050125 |
31 | ENVIRONMENTAL ENGINEERING | Engineering | 0.558548 | 0.093589 |
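If you'd like to try this ordering yourself, here is a minimal sketch that uses Python's built-in sqlite3 module. It assumes a small in-memory stand-in for the recent_grads table, filled with a handful of rows copied from the tables in this post; the variable names (sample_rows, ascending_rows) are our own and not part of the original dataset or course.

```python
import sqlite3

# Assumption: an in-memory table stands in for the post's database.
# The five rows below are copied from the result tables in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE recent_grads (
        Rank INTEGER,
        Major TEXT,
        Major_category TEXT,
        ShareWomen REAL,
        Unemployment_rate REAL
    )
    """
)
sample_rows = [
    (1, "PETROLEUM ENGINEERING", "Engineering", 0.120564, 0.018381),
    (2, "MINING AND MINERAL ENGINEERING", "Engineering", 0.101852, 0.117241),
    (5, "CHEMICAL ENGINEERING", "Engineering", 0.341631, 0.061098),
    (15, "ENGINEERING MECHANICS PHYSICS AND SCIENCE", "Engineering", 0.183985, 0.006334),
    (31, "ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548, 0.093589),
]
conn.executemany("INSERT INTO recent_grads VALUES (?, ?, ?, ?, ?)", sample_rows)

# Same filter as the query above; ORDER BY sorts ascending when no
# direction keyword is given.
query = """
SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering')
  AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate
"""
ascending_rows = conn.execute(query).fetchall()

for row in ascending_rows:
    print(row)
```

On this sample, the two majors that fail both conditions inside the parentheses are dropped, and the remaining rows come back from the lowest unemployment rate to the highest.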
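If we instead want the results ordered by the same column but in descending order, we can add the DESC keyword:
SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate DESC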
Rank | Major | Major_category | ShareWomen | Unemployment_rate |
---|---|---|---|---|
31 | ENVIRONMENTAL ENGINEERING | Engineering | 0.558548 | 0.093589 |
4 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 0.107313 | 0.050125 |
17 | INDUSTRIAL AND MANUFACTURING ENGINEERING | Engineering | 0.343473 | 0.042876 |
51 | ENGINEERING AND INDUSTRIAL MANAGEMENT | Engineering | 0.174123 | 0.033652 |
39 | INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 0.750473 | 0.028308 |
24 | MATERIALS ENGINEERING AND MATERIALS SCIENCE | Engineering | 0.292607 | 0.027789 |
3 | METALLURGICAL ENGINEERING | Engineering | 0.153037 | 0.024096 |
14 | MATERIALS SCIENCE | Engineering | 0.310820 | 0.023043 |
1 | PETROLEUM ENGINEERING | Engineering | 0.120564 | 0.018381 |
15 | ENGINEERING MECHANICS PHYSICS AND SCIENCE | Engineering | 0.183985 | 0.006334 |
Let’s write a query that returns all majors where ShareWomen is greater than 0.3 and Unemployment_rate is less than .1. Let’s include only the following columns in the results, in this order:
Major, ShareWomen, Unemployment_rate
We’ll order the results in descending order by the ShareWomen column.
SELECT Major, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE ShareWomen > 0.3 AND Unemployment_rate < .1
ORDER BY ShareWomen DESC
Major | ShareWomen | Unemployment_rate |
---|---|---|
EARLY CHILDHOOD EDUCATION | 0.967998 | 0.040105 |
MATHEMATICS AND COMPUTER SCIENCE | 0.927807 | 0.000000 |
ELEMENTARY EDUCATION | 0.923745 | 0.046586 |
ANIMAL SCIENCES | 0.910933 | 0.050862 |
PHYSIOLOGY | 0.906677 | 0.069163 |
MISCELLANEOUS PSYCHOLOGY | 0.905590 | 0.051908 |
HUMAN SERVICES AND COMMUNITY ORGANIZATION | 0.904075 | 0.037819 |
NURSING | 0.896019 | 0.044863 |
GEOSCIENCES | 0.881294 | 0.024374 |
MASS MEDIA | 0.877228 | 0.089837 |
COGNITIVE SCIENCE AND BIOPSYCHOLOGY | 0.854523 | 0.075236 |
ART HISTORY AND CRITICISM | 0.845934 | 0.060298 |
EDUCATIONAL PSYCHOLOGY | 0.817099 | 0.065112 |
GENERAL EDUCATION | 0.812877 | 0.057360 |
SOCIAL WORK | 0.810704 | 0.068828 |
TEACHER EDUCATION: MULTIPLE LEVELS | 0.798920 | 0.036546 |
COUNSELING PSYCHOLOGY | 0.798746 | 0.053621 |
MATHEMATICS TEACHER EDUCATION | 0.792095 | 0.016203 |
PSYCHOLOGY | 0.779933 | 0.083811 |
GENERAL MEDICAL AND HEALTH SERVICES | 0.774577 | 0.082102 |
HEALTH AND MEDICAL ADMINISTRATIVE SERVICES | 0.770901 | 0.089626 |
SOIL SCIENCE | 0.764427 | 0.000000 |
AREA ETHNIC AND CIVILIZATION STUDIES | 0.758060 | 0.063429 |
APPLIED MATHEMATICS | 0.753927 | 0.090823 |
FAMILY AND CONSUMER SCIENCES | 0.752144 | 0.067128 |
INDUSTRIAL PRODUCTION TECHNOLOGIES | 0.750473 | 0.028308 |
SOCIAL PSYCHOLOGY | 0.747561 | 0.029650 |
HUMANITIES | 0.745662 | 0.068584 |
HOSPITALITY MANAGEMENT | 0.733992 | 0.061169 |
SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION | 0.733968 | 0.054083 |
THEOLOGY AND RELIGIOUS VOCATIONS | 0.728495 | 0.062628 |
FRENCH GERMAN LATIN AND OTHER COMMON FOREIGN L… | 0.728033 | 0.075566 |
INTERDISCIPLINARY SOCIAL SCIENCES | 0.721866 | 0.092306 |
MISCELLANEOUS AGRICULTURE | 0.719974 | 0.059767 |
JOURNALISM | 0.719859 | 0.069176 |
MISCELLANEOUS EDUCATION | 0.718365 | 0.059212 |
COMPUTER AND INFORMATION SYSTEMS | 0.707719 | 0.093460 |
COMMUNICATION DISORDERS SCIENCES AND SERVICES | 0.707136 | 0.047584 |
MISCELLANEOUS HEALTH MEDICAL PROFESSIONS | 0.702020 | 0.081411 |
LIBERAL ARTS | 0.700898 | 0.078268 |
FORESTRY | 0.690365 | 0.096726 |
OCEANOGRAPHY | 0.688999 | 0.056995 |
ART AND MUSIC EDUCATION | 0.686024 | 0.038638 |
PHYSICAL FITNESS PARKS RECREATION AND LEISURE | 0.683943 | 0.051467 |
ADVERTISING AND PUBLIC RELATIONS | 0.673143 | 0.067961 |
HUMAN RESOURCES AND PERSONNEL MANAGEMENT | 0.672161 | 0.059570 |
MULTI-DISCIPLINARY OR GENERAL SCIENCE | 0.669999 | 0.055807 |
FINE ARTS | 0.667034 | 0.084186 |
COMPOSITION AND RHETORIC | 0.666119 | 0.081742 |
HISTORY | 0.651741 | 0.095667 |
ECOLOGY | 0.651660 | 0.054475 |
GENETICS | 0.643331 | 0.034118 |
TREATMENT THERAPY PROFESSIONS | 0.640000 | 0.059821 |
NUTRITION SCIENCES | 0.638147 | 0.068701 |
ZOOLOGY | 0.637293 | 0.046320 |
INTERNATIONAL RELATIONS | 0.632987 | 0.096799 |
UNITED STATES HISTORY | 0.630716 | 0.047179 |
DRAMA AND THEATER ARTS | 0.629505 | 0.077541 |
CRIMINOLOGY | 0.618223 | 0.097244 |
MICROBIOLOGY | 0.615727 | 0.066776 |
PLANT SCIENCE AND AGRONOMY | 0.606889 | 0.045455 |
BIOLOGY | 0.601858 | 0.070725 |
SECONDARY TEACHER EDUCATION | 0.601752 | 0.052229 |
AGRICULTURE PRODUCTION AND MANAGEMENT | 0.594208 | 0.050031 |
PRE-LAW AND LEGAL STUDIES | 0.591001 | 0.071965 |
AGRICULTURAL ECONOMICS | 0.589712 | 0.077250 |
STUDIO ARTS | 0.584776 | 0.089552 |
ENVIRONMENTAL SCIENCE | 0.584556 | 0.078585 |
BUSINESS MANAGEMENT AND ADMINISTRATION | 0.580948 | 0.072218 |
COMPUTER SCIENCE | 0.578766 | 0.063173 |
LANGUAGE AND DRAMA EDUCATION | 0.576360 | 0.050306 |
MISCELLANEOUS BIOLOGY | 0.566641 | 0.058545 |
NATURAL RESOURCES MANAGEMENT | 0.564639 | 0.066619 |
ENVIRONMENTAL ENGINEERING | 0.558548 | 0.093589 |
HEALTH AND MEDICAL PREPARATORY PROGRAMS | 0.556604 | 0.069780 |
MISCELLANEOUS SOCIAL SCIENCES | 0.543405 | 0.073080 |
ACTUARIAL SCIENCE | 0.535714 | 0.095652 |
SOCIOLOGY | 0.532334 | 0.084951 |
BOTANY | 0.528969 | 0.000000 |
INFORMATION SCIENCES | 0.526476 | 0.060741 |
PHARMACOLOGY | 0.524153 | 0.085532 |
GENERAL AGRICULTURE | 0.515543 | 0.019642 |
BIOCHEMICAL SCIENCES | 0.515406 | 0.080531 |
INTERCULTURAL AND INTERNATIONAL STUDIES | 0.507377 | 0.083634 |
PHYSICAL AND HEALTH EDUCATION TEACHING | 0.506721 | 0.074667 |
CHEMISTRY | 0.505141 | 0.053972 |
MULTI/INTERDISCIPLINARY STUDIES | 0.495397 | 0.070861 |
NEUROSCIENCE | 0.475010 | 0.048482 |
GEOLOGY AND EARTH SCIENCE | 0.470197 | 0.075449 |
PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTR… | 0.451465 | 0.055521 |
EDUCATIONAL ADMINISTRATION AND SUPERVISION | 0.448732 | 0.000000 |
PHYSICS | 0.448099 | 0.048224 |
MUSIC | 0.444582 | 0.075960 |
ASTRONOMY AND ASTROPHYSICS | 0.441356 | 0.021167 |
ELECTRICAL ENGINEERING | 0.437847 | 0.059174 |
MEDICAL TECHNOLOGIES TECHNICIANS | 0.434298 | 0.036983 |
NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL … | 0.430537 | 0.071540 |
PHYSICAL SCIENCES | 0.426924 | 0.035354 |
SCIENCE AND COMPUTER TEACHER EDUCATION | 0.423209 | 0.047264 |
GENERAL BUSINESS | 0.417925 | 0.072861 |
PHILOSOPHY AND RELIGIOUS STUDIES | 0.416810 | 0.096052 |
MISCELLANEOUS FINE ARTS | 0.410180 | 0.089375 |
COSMETOLOGY SERVICES AND CULINARY ARTS | 0.383719 | 0.055677 |
MARKETING AND MARKETING RESEARCH | 0.382900 | 0.061215 |
MECHANICAL ENGINEERING RELATED TECHNOLOGIES | 0.377437 | 0.056357 |
COMMERCIAL ART AND GRAPHIC DESIGN | 0.374356 | 0.096798 |
SPECIAL NEEDS EDUCATION | 0.366177 | 0.041508 |
FINANCE | 0.355469 | 0.060686 |
ARCHITECTURAL ENGINEERING | 0.350442 | 0.061931 |
INDUSTRIAL AND MANUFACTURING ENGINEERING | 0.343473 | 0.042876 |
CONSTRUCTION SERVICES | 0.342229 | 0.060023 |
CHEMICAL ENGINEERING | 0.341631 | 0.061098 |
ECONOMICS | 0.340825 | 0.099092 |
ENGLISH LANGUAGE AND LITERATURE | 0.339671 | 0.087724 |
ELECTRICAL ENGINEERING TECHNOLOGY | 0.325092 | 0.087557 |
GEOLOGICAL AND GEOPHYSICAL ENGINEERING | 0.324838 | 0.075038 |
OPERATIONS LOGISTICS AND E-COMMERCE | 0.322222 | 0.047859 |
TRANSPORTATION SCIENCES AND TECHNOLOGIES | 0.321296 | 0.072725 |
BIOLOGICAL ENGINEERING | 0.320784 | 0.087143 |
MATERIALS SCIENCE | 0.310820 | 0.023043 |
COMMUNICATIONS | 0.305109 | 0.075177 |
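To see how the WHERE filter and ORDER BY ... DESC work together, here is a rough sketch using Python's built-in sqlite3 module. The helper name majors_by_share_women and the in-memory table are our own assumptions for illustration; the five sample rows are copied from tables in this post, and the thresholds are passed as query parameters rather than typed into the SQL.

```python
import sqlite3


def majors_by_share_women(conn, min_share_women, max_unemployment):
    """Hypothetical helper: return (Major, ShareWomen, Unemployment_rate)
    rows that pass both thresholds, highest ShareWomen first."""
    query = """
    SELECT Major, ShareWomen, Unemployment_rate
    FROM recent_grads
    WHERE ShareWomen > ? AND Unemployment_rate < ?
    ORDER BY ShareWomen DESC
    """
    return conn.execute(query, (min_share_women, max_unemployment)).fetchall()


# Assumption: a tiny in-memory table stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, ShareWomen REAL, Unemployment_rate REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("NURSING", 0.896019, 0.044863),
        ("PETROLEUM ENGINEERING", 0.120564, 0.018381),
        ("COMMUNICATIONS", 0.305109, 0.075177),
        ("MINING AND MINERAL ENGINEERING", 0.101852, 0.117241),
        ("EARLY CHILDHOOD EDUCATION", 0.967998, 0.040105),
    ],
)

desc_rows = majors_by_share_women(conn, 0.3, 0.1)
for major, share_women, unemployment_rate in desc_rows:
    print(major, share_women, unemployment_rate)
```

The rows are inserted in no particular order, so any ordering in the output comes from the ORDER BY clause. Dropping the DESC keyword from the query would return the same rows from the lowest ShareWomen value to the highest.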
SQL is a powerful language for accessing data, and we hope you got a taste for it in this post. If you’d like to learn more, we encourage you to check out the course on which this blog post is based. In the course, we dive into how to:
Repost address: http://ejqwd.baihongyu.com/