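# Chinese Typesetting Style Guide

## Copywriting Style

• Before publishing an article, have several people proofread it to make sure it contains no wrong characters.
• Chinese internet slang is full of homophone puns and deliberate misspellings, such as 「墙裂」 (for 强烈) and 「童鞋」 (for 同学). If the copy is formal in tone, using them is strongly discouraged.
• Keep the copy concise. As long as the message is not weakened, tight sentences make it easier for readers to grasp the key points.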

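## Spaces

“Studies show that people who don’t like putting spaces between Chinese and English when typing have a rough love life: seventy percent of them will marry someone they don’t love at the age of 34, and the remaining thirty percent will end up leaving their inheritance to their cats. After all, love and writing both need the right amount of white space.”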

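## English Capitalization

| Correct | Incorrect |
| --- | --- |
| iOS | IOS, ios, Ios |
| macOS | Mac |
| Python | py, python |
| PHP | php |
| HTML 5 | H5 |
| Facebook | FB |
| AMD | Amd |
| Google | gg, google |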

## Additional Notes

### Use corner brackets as quotation marks in Simplified Chinese

「老师，『有条不紊』的『紊』是什么意思？」

“老师，‘有条不紊’的‘紊’是什么意思？”

# 小工具 | Moodle 快捷助手 & Essay 句长检查器发布！

Moodle Helper and Essay Sentence Length Checker Published!
(For English version, please scroll down.)

• 没人通知 Moodle 上传了新东西，每天只好一遍遍查看每个课程页？用 Moodle Helper，在首页，一切动态尽收眼底！
• Quiz 答题时，每一题都要“点选项 + 点 Check + 滑到下一题”，嫌操作太麻烦？用 Moodle Helper，只需轻按键盘上的 ABCD！
• EBP 的 Essay 要求每句话不超过 9（或 8）个单词，不小心超了两句没发现？用 EssaySentenceLengthChecker，帮你标出不合要求的句子！

• 智能打印：点一下 Moodle 上的 PPT，系统就会进行智能排版（如 1 面纸 8 页紧凑的 PPT、双面打印……），并自动发送到打印机，省纸又方便。（另：貌似学校 90% 的人不知道 PPT 是可以紧凑地不留白地打印到一张纸上的哦）
• 智能同步：自动将 Moodle 上的课件同步到你的电脑文件夹，不必每次再下载新课件并整理到文件夹中。

• 为什么没有 Safari 版：注册费用要一年 99 刀，如果有好心人给一个账号我立马开发！
• 只能使用 Chrome 浏览器吗：360 浏览器、QQ 浏览器可能也可以。
• 程序是否安全：Moodle Helper 完全开源，如觉得不安全，可查看源代码，也欢迎随时贡献新的代码和功能！
• 开源地址：https://github.com/turtlegood/CUHKSZHelper

MoodleHelper 的一些应用场景

• 想要提前预习，但是不知道教授什么时候才会更新 ppt，只好一遍又一遍地查看每个科目主页？
• 教授喜欢把文件上传到文件夹里，因此得经常打开各个文件夹看看有没有新东西？
• 新的 assignment 又神不知鬼不觉地出现了，根本没注意到课程页面下面多了一行？

……

( Here is the English version. )

Does any of this sound familiar?

• Nobody tells you when new files are uploaded to Moodle, so every day you have to check every course page and folder? With Moodle Helper, you can see all updates right on the front page!
• When doing quizzes, are you tired of having to “click the answer + click Check + scroll to the next question” for every question? With Moodle Helper, all you need to do is press A, B, C or D on your keyboard!
• EBP essays require every sentence to have no more than 9 (or 8) words, but you went over in a couple of sentences and didn’t notice? Use EssaySentenceLengthChecker, and it will mark every sentence that breaks the rule!

I have recently developed an extension called “Moodle Helper”, as well as a tool for EBP courses called “EssaySentenceLengthChecker”. I hope both will make things more convenient for everyone.

Installation:

A cooperation with the ITSO department is planned, so here are some features that may be added in the future:

• Intelligent Printing: Just click a PPT on Moodle, and the system will lay it out intelligently (e.g. 8 slides packed onto one side of a sheet, double-sided printing, …) and send it to the printer automatically, saving paper and effort. (P.S. It seems that 90% of people at our school don’t know that slides can be printed compactly on one sheet without wasted white space.)
• Intelligent Sync: Automatically sync the course materials on Moodle to a folder on your computer, so you never need to download new files and sort them into folders by hand.

Some common questions:

• Why is there no Safari version: Registration costs 99 dollars a year. If some kind soul can give me an account, I will develop it immediately!
• Does it only work in Chrome: It may also work in 360 Browser and QQ Browser.
• Is the software safe: Moodle Helper is fully open source. If you are worried about safety, you can read the source code, and you are welcome to contribute new code and features at any time!
• Open source: https://github.com/turtlegood/CUHKSZHelper

Use cases of Moodle Helper:

• Want to preview the slides, but don’t know when the professor will upload them, so you have to check every course page over and over again?
• The professor likes to upload files into folders, so you have to keep opening each folder to see whether there is anything new?
• A new assignment appears “mysteriously” without any notification, and you don’t notice the new line added at the bottom of the course page?

This is just a small tool I made in my spare time, so it inevitably has shortcomings; please bear with them. If you have any questions, feel free to contact me at any time! (Contact information is listed below.)

Developer: Chen Jingyi (Tom)

Contact me: fzyzcjy (WeChat), 117010016 (Email)

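# The “Core War” Today: A Would-Be PC Builder’s Look at the Recent AMD vs. Intel Battle

Ryzen 7 is positioned against the Core i7, and its asymmetric advantage is jaw-dropping. Matching performance at little more than half of Intel’s power consumption is already enough to tempt enthusiasts. Even more shocking is Ryzen’s outright advantage in price: the 8-core, 16-thread Ryzen 7 1700 launched in mainland China at only 2,400 yuan, even lower than the 4-core, 8-thread i7-7700K, whose multi-core performance is clearly lower. The flagship 8-core, 16-thread 1800X launched in mainland China at 3,999 yuan, while the i7-6900K, also 8 cores and 16 threads with comparable performance, costs more than twice as much! At AMD’s launch event, the moment the 1800X’s price appeared on the big screen, the audience could hardly hold back its cheers.

P.S.: The official Ryzen 7 launch event: http://www.bilibili.com/video/av8777517

AMD has far more staying power than Intel expected. Ryzen Threadripper, positioned against the i9, was rumored right after the i9 (popularly seen as AMD accepting Intel’s “declaration of war”) and was unveiled at Computex Taipei at the same time as the i9.

AMD seemed entirely unconcerned about old grudges: as soon as news of the i9 came out, it responded decisively with Threadripper. The result is astonishing. According to leaked pictures, the huge Threadripper is large enough to cover an adult man’s entire palm, fingers aside, and on the matching motherboard the CPU socket plus 8 memory slots easily take up half of the board’s area.

http://www.mykancolle.com/?post=1630 ), which greatly improved efficiency: reportedly, doubling the number of threads yields close to 100% more performance. On the contrary, because two CPUs are packaged together, the CPU has twice the threads, twice the cache and twice the lanes and channels (PCI-e lanes and memory channels) of the original, so some on-paper figures are even better than those of Intel’s i9, which gives the CPU an advantage.

Threadripper is meant not only to counter the i9 but also to fill the gap between Ryzen 7 and AMD’s server CPUs. Based on the Zen architecture, AMD’s Naples series (part of AMD’s new sub-brand EPYC) is meant to counter Intel’s Xeon. Like Threadripper, EPYC is the product of packaging 4 CPUs together, so it can reach up to 32 cores and 64 threads (as always, making up for clock speed with core count). With this, except for laptop CPUs, the “core war” between AMD and Intel extends almost all the way from the lowest-end segment to the highest.

P.S.: The history of the feud between AMD and Intel: http://tech.huanqiu.com/news/2017-03/10398639.html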

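# 2017 The Chinese University of Hong Kong, Shenzhen Big Data Analysis Challenge: Problem Statement

1. Background

With e-commerce booming, industry attaches great importance to user reviews on e-commerce platforms and uses them to guide product improvements in pursuit of higher profits. Natural language processing (NLP) is a major research direction in artificial intelligence today. In this competition, contestants need to develop a complete set of algorithms for analyzing reviews of a specific product from a specific brand, and give the sentiment of each review under the ten given categories.

2. Data

The data set contains 101,480 records. The text of each record is a review of a shampoo on an e-commerce platform. Each record has 12 columns, in the following order:

For the value of each type, “0” means the type does not appear in the review, “1” means the type appears and is positive, and “-1” means the type appears and is negative.

The types are explained below: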

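• Price: the review is about price, e.g. whether the product is good value or cheap.
• Fakeconcern: the review is about whether the product is genuine.
• Promotion: the review is about a sales promotion, e.g. it contains words such as “活动” (promotional event).
• Service: the review is about the seller’s service.
• Leakage: the review is about whether the shampoo leaked.
• Package: the review is about the product packaging.
• Loyalty: the review is about customer loyalty. Phrases such as “多次购买” (bought many times) or “第二次购买” (second purchase) are labeled “1”, and phrases such as “再也不买” (never buying again) are labeled “-1”.
• Smell: the review is about the smell.
• Effect: the review is about how well the shampoo works.
• Logistics: the review is about delivery.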

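Example:

Take the review with Id 11:

“跟超市搞活动时价格差不多，这次礼盒多了5小瓶赠品很合算。” (The price is about the same as a supermarket promotion, and this gift box comes with 5 extra small bottles, which is a good deal.)

Here “Price” is 1, and “Promotion” is also 1.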

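3. Evaluation

For each type there is one F1 value each for whether the type was detected, for positive and for negative, so there are 30 F1 values in total.

Negative reviews matter more to merchants, so every negative F1 value is multiplied by 1.4 when the total score is computed. Among the types, Smell, Effect, Package and Logistics are also more important to merchants, so their F1 values are multiplied by 1.25 when the total score is computed.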

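The $F1$ score of each item is:

$$F1=\frac{2\cdot Precision\cdot Recall}{Precision+Recall}$$

where the precision $Precision$ is:

$$Precision=\frac{TP}{TP+FP}$$

and the recall $Recall$ is:

$$Recall=\frac{TP}{TP+FN}$$

Here $TP$, $FP$ and $FN$ are the numbers of true positives, false positives and false negatives.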
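A minimal sketch of how the weighted total could be computed. How the 30 values are combined (sum or average), and whether the two weights multiply when both apply, is not stated above, so both choices here are assumptions; the kind labels `detected`, `positive` and `negative` are hypothetical names.

```python
IMPORTANT_TYPES = {"Smell", "Effect", "Package", "Logistics"}


def f1_score(tp, fp, fn):
    """F1 from true positive, false positive and false negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weight(type_name, kind):
    """Weight of one F1 value; kind is 'detected', 'positive' or 'negative'."""
    w = 1.0
    if kind == "negative":
        w *= 1.4   # negative reviews matter more
    if type_name in IMPORTANT_TYPES:
        w *= 1.25  # assumption: the two weights multiply when both apply
    return w


def total_score(f1_values):
    """f1_values maps (type_name, kind) to an F1 value; returns the weighted sum."""
    return sum(weight(t, k) * f for (t, k), f in f1_values.items())
```

For example, `total_score({("Price", "positive"): 0.8, ("Smell", "negative"): 0.5})` weights the first value by 1 and the second by 1.4 × 1.25.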

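# 2017 The Chinese University of Hong Kong, Shenzhen Big Data Analysis Challenge

1. Theme
The theme of this year’s Big Data Analysis Challenge is data analysis in the consumer goods industry.

2. Organizer: The Chinese University of Hong Kong, Shenzhen Computer Association (Computer @nd Comity)

3. Co-organizer: Yimian Network Technology Co., Ltd. (一面网络技术有限公司, yimian.com.cn)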

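4. Purpose and significance

As mobile devices have matured and become widespread, “mobile Internet + every industry” has entered a stage of rapid growth, with O2O (Online to Offline) consumption drawing the most attention. According to incomplete statistics, at least 10 startups in the O2O sector are valued at over 100 million, and giants valued in the tens of billions are present as well. The O2O sector naturally connects with hundreds of millions of consumers, and apps of all kinds record more than ten billion user behavior and location records every day, which makes it one of the best meeting points of big data research and commercial operation.

The Chinese University of Hong Kong, Shenzhen Big Data Analysis Challenge is organized by the university’s Computer Association and co-organized by Yimian Network Technology Co., Ltd. It is a high-end algorithm competition open to all students of the university. By opening up massive e-commerce review and sales data provided by Yimian, the competition gives every participant the chance to apply algorithms of their own design.

The competition aims to deepen students’ understanding of big data, let them learn and improve their data analysis skills while competing, and provide valuable experience for their future study and work.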

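5. Task

E-commerce platforms today hold a huge number of product reviews, and merchants and data analysts hope to extract the information in them and discover commercial value. Your task is to classify product reviews by category and by sentiment.

The training set is released at the start of the competition. The test set is released in two rounds. The first round contains about 50,000 records; contestants may submit results an unlimited number of times, but an evaluation is returned only at 12:00 noon each day, for the most recent submission. The second round is released the evening before the deadline and also contains about 50,000 records; contestants may submit an unlimited number of times on the final day, and the last submission before 23:59 that night counts.

To make it easier for interested students to take part, the Computer Association will provide basic guidance on data analysis and help everyone complete their first big data analysis. We believe everyone can gain valuable knowledge and experience from this competition.

Data format:

1. Training set (sample) contains the following fields: review ID, review content, category 1, category 2, category 3, sentiment
2. Test set (sample): review ID, review content
3. Submission (sample): review ID, category 1, category 2, category 3, sentiment

Evaluation metric:

An F1 score is computed for every category and for sentiment; the final score is a weighted combination of these F1 scores.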

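6. Scoring

The competition score has two parts:

• Prediction accuracy for sales: 60%
• Defense of the models and methods used: 40%

Once online testing opens, each contestant may submit once per day, and the final score is the best of all submissions. After online testing closes, contestants must take part in the defense; otherwise they receive no score for the second part.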

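7. Schedule (minor changes are possible depending on circumstances)

1. Registration: closes on January 22

Contestants take part in teams. Each contestant may join only one team, and each team has at most 3 members (in special cases an application may be submitted, and the limit may be relaxed).

Students who have not found a team may register individually and choose whether to be randomly grouped with other individual registrants.

Contestants register through a Google Form; a confirmation email is sent once registration succeeds.

2. Data release: January 22 to January 24

Once the data is released, contestants can start analyzing it and writing scripts.

3. Online testing opens: February 11

4. Online testing closes: March 9

5. On-site defense: March 12

Each team has five minutes to present its models and methods, followed by five minutes of Q&A. Prizes are awarded on site.

8. For other information, please contact the Computer Association by email: [email protected]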

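# Parameter Norm Regularization

## $L^1$ Parameter Regularization

$L^1$ regularization does not, in general, admit a clean analytical expression. We further assume that the Hessian is diagonal with every diagonal entry greater than zero (this assumption holds when the data for linear regression have been preprocessed, for example with PCA, so that the features are uncorrelated). The quadratic approximation of the $L^1$-regularized objective then decomposes into a sum over the parameters:

$$\hat{J}(\omega;X,y)=J(\omega^\ast;X,y)+\sum_i\left[\frac{1}{2}H_{ii}(\omega_i-\omega_i^\ast)^2+\alpha|\omega_i|\right]$$

Minimizing each term separately gives $\omega_i=\mathrm{sign}(\omega_i^\ast)\max\left\{|\omega_i^\ast|-\frac{\alpha}{H_{ii}},0\right\}$. For $\omega_i^\ast>0$ there are two cases:

1. When $\omega_i^\ast\leq\frac{\alpha}{H_{ii}}$, the optimum after regularization is $\omega_i=0$. In direction $i$, the contribution of $J(\omega;X,y)$ to $\tilde{J}(\omega;X,y)$ is overwhelmed, and the $L^1$ regularization pushes this parameter to zero.
2. When $\omega_i^\ast>\frac{\alpha}{H_{ii}}$, the regularization only shifts the optimum in that direction by a distance of $\frac{\alpha}{H_{ii}}$.

The case $\omega_i^\ast<0$ is similar: the regularization moves $\omega_i$ toward 0 by $\frac{\alpha}{H_{ii}}$, or sets it to 0.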
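The two cases above amount to a soft-thresholding rule. A minimal sketch for a single coordinate, assuming the diagonal-Hessian quadratic approximation described above (the function name and arguments are illustrative):

```python
def soft_threshold(w_star, alpha, h_ii):
    """Minimizer of 0.5 * h_ii * (w - w_star)**2 + alpha * abs(w), for h_ii > 0."""
    shift = alpha / h_ii
    # Shrink the magnitude by alpha / H_ii, but never past zero.
    magnitude = max(abs(w_star) - shift, 0.0)
    return magnitude if w_star >= 0 else -magnitude
```

With `alpha=1.0` and `h_ii=2.0` the threshold is 0.5, so an unregularized optimum of 0.3 is set to zero while 2.0 is shrunk to 1.5.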

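## Regularization as MAP Bayesian Inference

1. $L^2$ regularization is equivalent to MAP Bayesian inference with a Gaussian prior on the weights.

When we place a zero-mean Gaussian prior with variance $\alpha$ on each parameter, $p(\omega_j)=\frac{1}{\sqrt{2\pi\alpha}}\exp(-\omega_j^2/2\alpha)$, the maximum a posteriori estimate is

$$\omega_{MAP}=\arg\max_\omega p(\omega|X,y)=\arg\max_\omega p(y|X,\omega)\,p(\omega)$$

Taking the logarithm,

$$\omega_{MAP}=\arg\max_\omega\left[\log p(y|X,\omega)+\log p(\omega)\right]$$

Carrying out the maximization gives $\omega=\mathrm{argmin}_\omega(\frac{1}{n}\|y-\omega^TX\|_2+\alpha\|\omega\|_2)$; the resulting objective has a form similar to $L^2$ regularization.

2. $L^1$ regularization is equivalent to MAP Bayesian inference with a Laplace prior on the weights.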

# Linear Model (1) - Linear Regression

### Maximum Likelihood Linear Regression

Suppose we have a data set $S=\{(x^{(i)},y^{(i)}),i=1,\dots,m\}$ of $m$ training examples, where each $x^{(i)}\in\mathbb R^n$ has $n$ features. Let us assume that the target variables and the inputs are related via the linear equation

$$y^{(i)}=\theta^Tx^{(i)}+\epsilon^{(i)}$$

where $\epsilon^{(i)}$ is an error term that captures either unmodeled effects or random noise. Let’s assume that the $\epsilon^{(i)}$’s are distributed i.i.d. (independently and identically distributed) according to a Gaussian distribution with mean zero and variance $\sigma^2$, which can be written as $\epsilon^{(i)}\sim N(0, \sigma^2)$. The pdf of $\epsilon^{(i)}$ is given by

$$p(\epsilon^{(i)})=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(\epsilon^{(i)})^2}{2\sigma^2}\right)$$

Because $\epsilon^{(i)}=y^{(i)}-\theta^Tx^{(i)}$, the pdf can also be written as

$$p(y^{(i)}|x^{(i)};\theta)=\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y^{(i)}-\theta^Tx^{(i)})^2}{2\sigma^2}\right)$$

Notice that the notation ‘$p(y^{(i)}|x^{(i)};\theta)$’ indicates that this is the distribution of $y^{(i)}$ given $x^{(i)}$, parameterized by $\theta$. Since $\theta$ is not a random variable, the formula does not condition on $\theta$. We can write the distribution as ‘$y^{(i)}|x^{(i)};\theta\sim N(\theta^Tx^{(i)},\sigma^2)$’. Given an input matrix $X=(x^{(1)},x^{(2)},\dots,x^{(m)})^T$ and $\theta$, the distribution of the $y^{(i)}$’s is given by $p(\overrightarrow{y}|X;\theta)$. When we wish to explicitly view this as a function of $\theta$, we call it the likelihood function:

$$L(\theta)=L(\theta;X,\overrightarrow{y})=p(\overrightarrow{y}|X;\theta)$$

Note that by the independence assumption on the $\epsilon^{(i)}$’s, this can be written as

$$L(\theta)=\prod_{i=1}^m p(y^{(i)}|x^{(i)};\theta)=\prod_{i=1}^m\frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(y^{(i)}-\theta^Tx^{(i)})^2}{2\sigma^2}\right)$$

Now, given this probabilistic model relating the $y^{(i)}$’s and the $x^{(i)}$’s, the principle of maximum likelihood says that we should choose $\theta$ so as to make the data as probable as possible. So we are facing an optimization problem:

$$\max_\theta L(\theta)$$

We define a new likelihood function called the log likelihood:

$$\ell(\theta)=\log L(\theta)=m\log\frac{1}{\sqrt{2\pi}\sigma}-\frac{1}{2\sigma^2}\sum_{i=1}^m(y^{(i)}-\theta^Tx^{(i)})^2$$

The first term does not depend on $\theta$, so the maximization problem $\max_{\theta}\ell(\theta)$ becomes a minimization problem:

$$\min_\theta\frac{1}{2}\sum_{i=1}^m(y^{(i)}-\theta^Tx^{(i)})^2$$

This is our original least-squares cost function. Under the previous probabilistic assumptions on the data, least-squares regression corresponds to finding the maximum likelihood estimate of $\theta$.

Back to our linear regression problem: assume we have a data set $S=\{(x^{(i)},y^{(i)}),i=1,\dots,m\}$ and the hypothesis $h_{\theta}(x)=\theta^Tx$ (we set $x_0$ to be $1$ so that the constant $\theta_0$ can be included in $\theta$, and thus $x^{(i)}\in\mathbb R^{(n+1)}, \theta\in\mathbb R^{(n+1)}$). Notice that in our hypothesis, $\theta$ is not the population parameter; it is the parameter we are going to estimate by maximizing the likelihood function. According to the probabilistic analysis above, we define the cost function

$$J(\theta)=\frac{1}{2}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)^2$$

And the linear regression problem can be expressed as the optimization problem

$$\min_\theta J(\theta)$$

Let’s rewrite this problem in a simpler matrix form. Denote $X=(x^{(1)},x^{(2)},\dots,x^{(m)})^T$ and $y=(y^{(1)},y^{(2)},\dots,y^{(m)})^T$, where $x^{(i)}\in\mathbb R^{(n+1)}$, $y\in\mathbb R^{m}$ and $\theta\in\mathbb R^{(n+1)}$. Dropping the constant factor $\frac{1}{2}$, the problem can be expressed as

$$\min_\theta \|X\theta-y\|^2_2$$

And the linear model of the inputs $X$ can be written as $y=\theta^Tx$, where $\theta=\arg \min_\theta \|X\theta-y\|^2_2$. We will solve this problem later.

### Least Squares Linear Regression

If we want to build a model that fits the sample data with the least error, a simple way is to make the difference between the estimates and the samples as small as possible in some sense. In least squares linear regression, we minimize the sum of squared errors. Suppose we have the hypothesis $h_\theta=X\theta$ (in matrix form, where $X\in\mathbb R^{m\times(n+1)}$, $\theta\in\mathbb R^{(n+1)}$, and $x^{(i)}_0=1$ for all inputs so that $\theta_0$ is the constant term). The problem can be expressed in the form

$$\min_\theta \|X\theta-y\|^2_2$$

This has the same expression as maximum likelihood regression, but the two differ. In maximum likelihood estimation we make the important assumption that all the samples $x^{(1)}, x^{(2)},\dots,x^{(m)}$ are i.i.d. Here $\theta$ is the parameter of the model and $h$ is our model, or hypothesis. For maximum likelihood regression, the most reasonable estimate is the one that maximizes the probability of observing those $y$’s for the $m$ samples drawn from the model. For least squares, the most reasonable estimate is the one that fits the samples best (minimizing the sum of squared errors). Clearly the two regression methods come from different ideas.

When we employ maximum likelihood regression, we need to know the probability distribution of the errors. In general, we assume that the distribution is Gaussian. Under this assumption, maximum likelihood regression is equivalent to least squares regression.

Least squares can also be understood geometrically. Suppose every input sample $x^{(i)}$ lies in $\mathbb R^{(n+1)}$. What we really want is a $\theta$ such that $\theta^Tx^{(i)}=y^{(i)}$ holds for all $m$ samples, which can be written as $X\theta=y$. In real-world problems, however, the number of samples $m$ is usually larger than the number of features $n$, so the system $X\theta=y$ is overdetermined. Let $\theta'$ be the solution of the least squares problem; then $X\theta'$ is the orthogonal projection of $y$ onto the space spanned by the column vectors of $X$. From linear algebra, this projection is $X(X^T X)^{-1}X^Ty$ (assuming $X^TX$ is invertible), so the solution is $\theta'=(X^T X)^{-1}X^Ty$. We will see later that this is the same solution we obtain by differentiation.

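### Solve $\min_\theta \|X\theta-y\|^2_2$

Apart from the least squares solution of the overdetermined system, we can solve this another way. For $\|X\theta-y\|^2_2$ we have $J(\theta)=(X\theta-y)^T(X\theta-y)$, and the optimization problem becomes

$$\min_\theta J(\theta)=\min_\theta\left(\theta^TX^TX\theta-2y^TX\theta+y^Ty\right)$$

We take the derivative

$$\nabla_\theta J(\theta)=2X^TX\theta-2X^Ty$$

and set it to zero:

$$X^TX\theta=X^Ty$$

Assuming $X^TX$ is invertible, we get the solution

$$\theta=(X^TX)^{-1}X^Ty$$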
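As an illustration, here is a minimal pure-Python sketch of the closed-form solution $\theta=(X^TX)^{-1}X^Ty$ for the special case of one feature plus the constant term, where $X^TX$ is only a 2×2 matrix. The function name and the sample data are made up for the example.

```python
def fit_line(xs, ys):
    """Solve X^T X theta = X^T y for rows (1, x); returns (theta0, theta1)."""
    m = len(xs)
    # Entries of the 2x2 matrix X^T X and of the vector X^T y.
    s_x = sum(xs)
    s_xx = sum(x * x for x in xs)
    s_y = sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    det = m * s_xx - s_x * s_x  # zero when all x are equal (X^T X singular)
    if det == 0:
        raise ValueError("X^T X is singular")
    # Apply the explicit inverse of the 2x2 matrix.
    theta0 = (s_xx * s_y - s_x * s_xy) / det
    theta1 = (m * s_xy - s_x * s_y) / det
    return theta0, theta1


# Points lying exactly on y = 1 + 2x recover the intercept and the slope.
theta0, theta1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

For more features, a linear algebra library would normally be used instead of an explicit inverse.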

# Week One - Python in CUHK(SZ)

## Brief Introduction to Computer Programming

### Why programming?

Computers are built to help people solve problems, but a computer does not understand what we say.
So we need to communicate with computers in their own languages (computer programming languages).

### Components in a computer

1. Central processing unit (CPU): execute your program. Similar to human brain, very fast but not that smart
2. Input device: take inputs from users or other devices
3. Output device: output information to users or other devices
4. Main memory: store data, fast and temporary storage
5. Secondary memory: slower but large size, permanent storage

### What can a computer actually understand

The computers used nowadays can understand only binary numbers (i.e. 0 and 1).
Computers use voltage levels to represent 0 and 1.
Instructions expressed in binary code are called machine language.

#### Low level language – Assembly Language

An assembly language is a low-level programming language, in which there is a very strong (generally one-to-one) correspondence between the language and machine code instructions.
Each assembly language is specific to a particular computer architecture
Assembly language is converted into executable machine code by a utility program referred to as an assembler

#### High-level languages: C, C++, Java, Python…

High level languages cannot be executed directly
High level languages must be converted into low level languages first
Lower level languages have higher language efficiency (they are faster to run on a computer)
Higher level languages have higher development efficiency (it is easier to write programs in these languages)

### Memory and addressing

A computer’s memory consists of an ordered sequence of bytes for storing data. Every location in the memory has a unique address.

The key difference between high-level and low-level programming languages is whether the programmer has to deal with memory addressing directly.

### Operating Systems

The operating system (OS) is a low level program, which provides all basic services for managing and controlling a computer’s activities

Applications are programs which are built based upon an OS

Main functions of an OS:

• Controlling and monitoring system activities
• Allocating and assigning system resources
• Scheduling operations

Popular OS: Windows, macOS, Linux, iOS, Android (a kind of Linux)…

### The units of information (data)

• Bit (比特/位): a binary digit which takes either 0 or 1. The bit is the smallest unit of information in computer programming
• Byte (字节): 1 byte = 8 bits; every English (ASCII) character is represented by 1 byte
• KB (千字节): 1 KB = 2^10 B = 1024 B
• MB (兆字节): 1 MB = 2^20 B = 1024 KB
• GB (千兆字节): 1 GB = 2^30 B = 1024 MB
• TB (兆兆字节): 1 TB = 2^40 B = 1024 GB

## Number Systems

A numeral system (or system of numeration) is a writing system for expressing numbers; that is, a mathematical notation for representing numbers of a given set, using digits or other symbols in a consistent manner.

Each positional number system contains two elements, a base and a set of symbols. Using the decimal system as an example, its base is 10 and the symbols are, of course, numbers.

Commonly, decimal number system, binary number system and hexadecimal number system are used in computer.

### Decimal Number System

In the decimal number system, the base is 10 and the symbols are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.

Every number can be decomposed into a sum of terms, each of which is a positional value times a weight:

$$N=a_n\times10^n+a_{n-1}\times10^{n-1}+\dots+a_0\times10^0+a_{-1}\times10^{-1}+a_{-2}\times10^{-2}+\dots$$

$a_n$ is the positional value (ranging from 0 to 9), while $10^n$ represents the weight.

### Binary Number System

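In the binary number system, the base is 2 and the symbols are 0 and 1.

$$N=a_n\times2^n+a_{n-1}\times2^{n-1}+\dots+a_0\times2^0+a_{-1}\times2^{-1}+a_{-2}\times2^{-2}+\dots$$

$a_n$ is the positional value (0 or 1), while $2^n$ represents the weight.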

### Hexadecimal number system

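In the hexadecimal system, the base is 16 and we use 16 symbols: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f}. We carry into the next position whenever a position reaches sixteen (逢十六进一), so sixteen is written “10”.

$$N=a_n\times16^n+a_{n-1}\times16^{n-1}+a_{n-2}\times16^{n-2}+\dots+a_0\times16^0+a_{-1}\times16^{-1}+a_{-2}\times16^{-2}+\dots$$

$a_n$ is the positional value (ranging from 0 to 15), while $16^n$ represents the weight.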

### Number System Conversion

There are many methods or techniques which can be used to convert numbers from one base to another.

#### Decimal to Other Base System

• Step 1 − Divide the decimal number to be converted by the value of the new base.
• Step 2 − Get the remainder from Step 1 as the rightmost digit (least significant digit) of new base number.
• Step 3 − Divide the quotient of the previous divide by the new base.
• Step 4 − Record the remainder from Step 3 as the next digit (to the left) of the new base number.

Repeat Steps 3 and 4, getting remainders from right to left, until the quotient becomes zero in Step 3.

The remainders have to be arranged in the reverse order so that the first remainder becomes the Least Significant Digit (LSD) and the last remainder becomes the Most Significant Digit (MSD).

##### Example

Decimal Number: 29 -> Binary Equivalent

| Step | Operation | Result | Remainder |
| --- | --- | --- | --- |
| Step 1 | 29 / 2 | 14 | 1 |
| Step 2 | 14 / 2 | 7 | 0 |
| Step 3 | 7 / 2 | 3 | 1 |
| Step 4 | 3 / 2 | 1 | 1 |
| Step 5 | 1 / 2 | 0 | 1 |

Binary Number: 11101
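The repeated-division procedure above can be sketched in Python as follows. The function name `to_base` is illustrative, and bases up to 16 are assumed so that the digits 0 to 9 and a to f suffice.

```python
DIGITS = "0123456789abcdef"


def to_base(number, base):
    """Convert a non-negative integer to a digit string in the given base (2 to 16)."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The quotient feeds the next step; the remainder is the next digit.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])  # least significant digit comes first
    # Reverse so that the last remainder becomes the most significant digit.
    return "".join(reversed(digits))
```

`to_base(29, 2)` follows exactly the five steps in the table and gives `'11101'`. Python’s built-in `bin()` and `hex()` do the same job for bases 2 and 16.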

#### Other Base System to Decimal System

• Step 1 − Determine the column (positional) value of each digit (this depends on the position of the digit and the base of the number system).
• Step 2 − Multiply the obtained column values (in Step 1) by the digits in the corresponding columns.
• Step 3 − Sum the products calculated in Step 2. The total is the equivalent value in decimal.

##### Example

Binary Number: 11101 -> Decimal Equivalent

$$1\times2^4+1\times2^3+1\times2^2+0\times2^1+1\times2^0=16+8+4+0+1=29$$

Decimal Number: 29
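The three steps can be sketched in Python like this. The function name is illustrative, and digits are assumed to be valid for the given base (up to 16).

```python
def to_decimal(digits, base):
    """Sum digit * base**position over all digits of a string such as '11101'."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        digit = int(char, 16)            # handles 0-9 and a-f
        total += digit * base ** position
    return total
```

`to_decimal("11101", 2)` adds 1 + 0 + 4 + 8 + 16 and returns 29. The built-in `int("11101", 2)` gives the same result.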

#### Decimal Fraction to Binary

You can convert a decimal fraction to binary by repeatedly multiplying the fractional part by 2. At each step, the integer part of the product (the carry, 0 or 1) is the next binary digit after the point, and the remaining fractional part is multiplied again.
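A small sketch of this repeated multiplication. The function name and the `max_digits` cut-off are illustrative; the cut-off is needed because many fractions, such as 0.1, never terminate in binary.

```python
def fraction_to_binary(fraction, max_digits=10):
    """Binary digits after the point for 0 <= fraction < 1."""
    digits = []
    while fraction > 0 and len(digits) < max_digits:
        fraction *= 2
        carry = int(fraction)      # integer part of the product: 0 or 1
        digits.append(str(carry))
        fraction -= carry          # keep only the fractional part
    return "0." + ("".join(digits) or "0")
```

For 0.625 the products are 1.25, 0.5 and 1.0, so the carries 1, 0, 1 give `'0.101'`.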

### How does a program run?

A computer doesn’t actually understand the phrase ‘Hello, world!’, and it doesn’t know how to display it on screen. It only understands on and off. So to actually run a command like print('Hello, world!'), it has to translate all the code in a program into a series of ons and offs that it can understand.

To do that, a number of things happen:

• The source code is translated into assembly language.
• The assembly code is translated into machine language.
• The machine language is directly executed as binary code.

Confused? Let’s go into a bit more detail. The coding language first has to translate its source code into assembly language, a super low-level language that uses words and numbers to represent binary patterns. Depending on the language, this may be done with an interpreter (where the program is translated line-by-line), or with a compiler (where the program is translated as a whole).

The coding language then sends off the assembly code to the computer’s assembler, which converts it into the machine language that the computer can understand and execute directly as binary code.

An interpreter (解释器) is a computer program that directly executes, i.e. performs, instructions written in a programming or scripting language, without previously compiling them into a machine language program.

A compiler (编译器) is a computer program (or a set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language), with the latter often having a binary form known as object code

## Introduction to Python

### What is Python, and why use it?

Python is a widely used high-level, general-purpose, interpreted, dynamic programming language. Its
design philosophy emphasizes code readability, and its syntax allows programmers to express
concepts in fewer lines of code than would be possible in languages such as C++ or Java. The
language provides constructs intended to enable clear programs on both a small and large scale.

Python supports multiple programming paradigms, including object-oriented, imperative and functional programming or procedural styles. It features a dynamic type system and automatic memory management and has a large and comprehensive standard library.

Python interpreters are available for many operating systems, allowing Python code to run on a wide variety of systems. Using third-party tools, such as Py2exe or Pyinstaller, Python code can be packaged into stand-alone executable programs for some of the most popular operating systems, so Python-based software can be distributed to, and used on, those environments with no need to install a Python interpreter.

CPython, the reference implementation of Python, is free and open-source software and has a community-based development model, as do nearly all of its variant implementations.

### Install Python 3

In this course, we will use Python 3.5. Before we formally start the course, Python 3 must be installed on your computer.

If you do not know how to use PowerShell on Windows, Terminal on macOS or bash on Linux, you need to learn that first.

Why Python 3 and not Python 2? See the differences at https://wiki.python.org/moin/Python2orPython3

Windows Users:

1. Open https://www.python.org/downloads/ in your browser
2. Click Download Python 3.5.2, download Windows x86 executable installer
3. Install

macOS Users (Recommended):

1. Open your Terminal
2. Type in the command shown here: http://brew.sh/
3. Follow the instructions to install Homebrew, including the Xcode command line tools
4. After you installed it, type in brew install python3 virtualenv
5. Type in python3 -V, if it shows Python 3.5.2 then everything is done.

Linux Users:

1. Use your package manager to install Python 3

### Python Syntax

Python uses indentations to identify different program blocks. Here we show a simple example of Python script.

Can you guess what will be on display?

Hello World Computer @nd Comity
2016-10-13 15:22:24.510837


Why? We will explain it in future courses.

## IDE

What is IDE? An integrated development environment (IDE) is a software application that provides comprehensive facilities to computer programmers for software development. An IDE normally consists of a source code editor, build automation tools and a debugger. Most modern IDEs have intelligent code completion.

Here, we recommend PyCharm once you believe you have mastered Python. It is commercial software from JetBrains. Do we have to pay for it? No. As students, we can use the educational offer.

Detailed information will not be shown here.

1. Obtain JetBrains Education promotion here: https://www.jetbrains.com/students
2. Download PyCharm here: https://www.jetbrains.com/pycharm

## Basic Python

### Comments

Anything after a “#” is ignored by Python.

Why comment?

•  Describe what is going to happen in a sequence of code
•  Document who wrote the code and other important information
•  Turn off a line of code – usually temporarily

### Variable

A variable is something that holds a value that may change. In simplest terms, a variable is just a
box that you can put stuff in. You can use variables to store all kinds of stuff, but for now, we are just
going to look at storing numbers in variables.

#### Rules for defining variables in Python

• Must start with a letter or an underscore
• Can only contain letters, numbers and underscores
• Case sensitive

##### Reserved words

The following identifiers are used as reserved words, or keywords of the language, and cannot be used as ordinary identifiers. They must be spelled exactly as written here:

False class finally is return None continue
for lambda try True def from nonlocal
while and del global not with as
elif if or yield assert else import
pass break except in raise
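The naming rules and the reserved words can both be checked from Python itself using the standard `keyword` module. The helper name `is_valid_name` is illustrative. Note that `str.isidentifier()` also accepts non-ASCII letters, which Python 3 allows in names.

```python
import keyword


def is_valid_name(name):
    """True if name follows the identifier rules and is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


checks = {
    "_count1": is_valid_name("_count1"),  # starts with an underscore: allowed
    "1count": is_valid_name("1count"),    # starts with a digit: not allowed
    "class": is_valid_name("class"),      # reserved word: not allowed
    "Class": is_valid_name("Class"),      # names are case sensitive: allowed
}
```

`keyword.kwlist` holds the full list of reserved words for the Python version you are running.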

#### Assign a variable

##### Multiple Assignment

Python allows you to assign a single value to several variables simultaneously. For example, a = b = c = 1.

Here, an integer object is created with the value 1, and all three variables are assigned to the same memory location. You can also assign multiple objects to multiple variables. For example, a, b, c = 1, 2, "john".

Here, two integer objects with values 1 and 2 are assigned to variables a and b respectively, and one string object with the value “john” is assigned to the variable c.
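A short runnable version of the two forms of multiple assignment described above. The variable names follow the description; the `is` check is an addition that shows the names really refer to one object.

```python
# One value bound to three names.
a = b = c = 1
same_object = a is b and b is c   # True: all three names refer to one object

# Several values unpacked into several names, left to right.
a, b, c = 1, 2, "john"
after_unpacking = (a, b, c)       # (1, 2, 'john')
```

The number of names on the left must match the number of values on the right, otherwise Python raises a ValueError.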

#### Extensive Knowledge

When you assign to a variable you are binding the name to an object. From that point onwards you can refer to the object by using the name, until that name is rebound.

In the first example the name i is bound to the value 5. Binding different values to the name j does not have any effect on i, so when you later print the value of i the value is still 5.

In the second example you bind both i and j to the same list object. When you modify the contents of the list, you can see the change regardless of which name you use to refer to the list.

Note that it would be incorrect if you said “both lists have changed”. There is only one list but it has two names (i and j) that refer to it.
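The two examples are described above but not shown. A sketch consistent with that description might look like the following; the concrete values other than 5 are illustrative.

```python
# First example: rebinding j leaves i untouched.
i = 5
j = i
j = 3
first_i = i             # still 5

# Second example: i and j are two names for the same list object.
i = [1, 2, 3]
j = i
i[0] = 5
second_j = j            # [5, 2, 3], the change is visible through either name
same_list = i is j      # True: there is only one list
```

To get an independent list instead of a second name, copy it, for example with `j = i[:]` or `j = list(i)`.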

Reference: http://stackoverflow.com/questions/13530998/python-variables-are-pointers

### Operators

We can easily do numeric operations in Python — actually you can take it as a simple calculator!

#### Basic mathematical operators

| Operator | Description |
| --- | --- |
| `+` | add |
| `-` | subtract |
| `*` | multiply |
| `/` | divide |
| `**` | exponentiation |
| `( )` | parentheses |
| `//` | floor division |
| `%` | modulo, find the remainder |
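A quick tour of the operators in the table, using Python 3 semantics. The variable names are just for illustration.

```python
total = 7 + 2             # 9
difference = 7 - 2        # 5
product = 7 * 2           # 14
quotient = 7 / 2          # 3.5, true division always gives a float
power = 7 ** 2            # 49
floored = 7 // 2          # 3
remainder = 7 % 2         # 1
negative_floor = -7 // 2  # -4, floor division rounds toward negative infinity
```

Note the difference between `/` and `//`: in Python 3, `/` never discards the fractional part.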

#### Operator precedence

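Precedence rules, from highest to lowest:

• Parentheses always have the highest priority
• Power
• Multiplication, division and remainder
• Addition and subtraction
• Operators with the same precedence are evaluated left to right (except **, which is evaluated right to left)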
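A few expressions that exercise each rule in the list above; the variable names are arbitrary.

```python
a = 2 + 3 * 4      # 14: multiplication before addition
b = (2 + 3) * 4    # 20: parentheses first
c = 2 * 3 ** 2     # 18: power before multiplication
d = 10 - 4 - 3     # 3: same precedence, evaluated left to right
e = 2 ** 3 ** 2    # 512: ** groups right to left, i.e. 2 ** (3 ** 2)
f = -2 ** 2        # -4: power binds tighter than the unary minus
```

When in doubt, add parentheses; they cost nothing and make the intent obvious.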

#### Logical operators

Logical operators can be used to combine several logical expressions into a single expression

Python has three logical operators: not, and, or
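A small illustration of the three logical operators. The scenario and variable names are made up for the example.

```python
age = 20
has_ticket = False

can_enter = age >= 18 and has_ticket     # False: both sides must be true
gets_discount = age < 12 or age >= 65    # False: neither side is true
must_buy_ticket = not has_ticket         # True

# and / or return one of their operands, which allows simple defaults.
name = "" or "anonymous"                 # 'anonymous'
```

`and` and `or` stop evaluating as soon as the result is known (short-circuit evaluation).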

#### Comparison operators

Boolean expressions ask a question and produce a Yes/No result, which we use to control program flow. Boolean expressions use comparison operators to evaluate Yes/No or True/False.

Comparison operators check variables but do not change the values of variables.

| Operator | Description |
| --- | --- |
| `x < y` | Is x less than y? |
| `x <= y` | Is x less than or equal to y? |
| `x == y` | Is x equal to y? |
| `x >= y` | Is x greater than or equal to y? |
| `x > y` | Is x greater than y? |
| `x != y` | Is x not equal to y? |

Careful: “=” is used for assignment, while “==” tests equality.

### Indentation

• Increase indent: indent after an if or for statement (after :)
• Maintain indent: to indicate the scope of the block (which lines are affected by the if/for)
• Decrease indent: go back to the level of the if statement or for statement to indicate the end of the block
• Blank lines are ignored – they do not affect indentation
• Comments on a line by themselves are ignored w.r.t. indentation
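The rules above can be seen in a small function; the name `classify` and its data are illustrative.

```python
def classify(numbers):
    labels = []
    for n in numbers:            # indent increases after the for statement
        if n % 2 == 0:           # and again after the if statement
            labels.append("even")

            # blank lines and comment-only lines do not end a block
        else:
            labels.append("odd")
    return labels                # back at function level: the loop has ended
```

`classify([1, 2, 3])` returns `['odd', 'even', 'odd']`. Moving the `return` line one level deeper would end the function after the first element.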

### Evaluate

The eval() function takes a string argument and evaluates that string as a Python expression, i.e., just as if the programmer had directly entered the expression as code. The function returns the result of that expression.

eval() gives the programmers the flexibility to determine what to execute at run-time.

One should be cautious about using it in situations where users could potentially cause problems with “inappropriate” input.
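A brief sketch of `eval()` in use, together with `ast.literal_eval()` from the standard library, which is one common way to handle untrusted input when only literal values (numbers, strings, lists and so on) are expected. The expressions are illustrative.

```python
import ast

result = eval("3 + 4 * 2")                  # 11
x = 5
scaled = eval("x * 2", {"x": x})            # 10: names are looked up in the dict

# literal_eval only accepts literals, so it refuses to run arbitrary code.
safe_value = ast.literal_eval("[1, 2, 3]")  # [1, 2, 3]

try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True                         # function calls are not literals
```

Passing the same malicious string to `eval()` would actually execute it, which is why user input should not be fed to `eval()` directly.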

### Data Type

#### Numbers

Python supports three different numerical types −

• int (signed integers)
• float (floating point real values)
• complex (complex numbers)

Note that int(1.23) does not raise an exception. It returns an int object with value 1: the conversion drops the fractional part and keeps only the integer part.
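A few conversions between the numeric types, to make the truncation behaviour concrete. The variable names are illustrative.

```python
truncated = int(1.23)     # 1: the fractional part is dropped
toward_zero = int(-1.7)   # -1: truncation goes toward zero, not down
rounded = round(1.7)      # 2: use round() when you want the nearest integer
as_float = float(3)       # 3.0
z = complex(1, 2)         # (1+2j)
magnitude = abs(3 + 4j)   # 5.0
```

`type(truncated)` is `int` and `type(as_float)` is `float`, so the conversion creates a new object of the target type.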

##### Floating point real values

Floating-point numbers (float type) are numbers with a decimal point or an exponent (or both). Examples are 5.0, 10.24, 0.0, 12. and .3. We can use scientific notation to denote very large or very small floating-point numbers, e.g. 3.8 x 10^15. The first part of the number, 3.8, is the mantissa and 15 is the exponent. We can think of the exponent as the number of times we have to move the decimal point to the right to get to the actual value of the number.

In Python, we can write the number 3.8 x 10^15 as 3.8e15 or 3.8e+15. We can also write it as 38e14 or .038e17. They are all the same value. A negative exponent indicates smaller numbers, e.g. 2.5e-3 is the same as 0.0025. Negative exponents can be thought of as how many times we have to move the decimal point to the left. Negative mantissa indicates that the number itself is negative, e.g. -2.5e3 equals -2500 and -2.5e-3 equals -0.0025.

Be aware that floats are stored in binary with limited precision, so most decimal fractions cannot be represented exactly. float(3.2) is not exactly 3.2 but a nearby value (about 3.2000000000000002). This does not matter in ordinary applications, but for calculations that need exact decimal results you may use the standard decimal module or a third-party package.
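A short demonstration of this rounding behaviour, using only the standard library. `math.isclose` and the `decimal` module are two common ways to cope with it.

```python
import math
from decimal import Decimal

naive_equal = (0.1 + 0.2 == 0.3)              # False: both sides carry rounding error
close_enough = math.isclose(0.1 + 0.2, 0.3)   # True: compare with a tolerance

# Decimal built from strings keeps exact decimal digits.
exact_sum = Decimal("0.1") + Decimal("0.2")   # Decimal('0.3')

# Decimal built from a float reveals the value that is actually stored.
stored = Decimal(3.2)
is_exactly_3_2 = (stored == Decimal("3.2"))   # False
```

Printing `stored` shows the full binary approximation that Python keeps for 3.2.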

#### List

Lists are the most versatile of Python’s compound data types. A list contains items separated by commas and enclosed within square brackets ([]). To some extent, lists are similar to arrays in C. One difference between them is that all the items belonging to a list can be of different data type.

The values stored in a list can be accessed using the slice operator ([ ] and [:]) with indexes starting at 0 in the beginning of the list and working their way to end -1. The plus (+) sign is the list concatenation operator, and the asterisk (*) is the repetition operator.

For example,
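The example itself is not shown above, so here is a small sketch of the operations just described. The list contents are illustrative.

```python
items = ["abcd", 786, 2.23, "john", 70.2]   # items of different data types
tiny = [123, "john"]

first = items[0]          # 'abcd': indexes start at 0
middle = items[1:3]       # [786, 2.23]: index 1 up to, but not including, 3
tail = items[2:]          # [2.23, 'john', 70.2]
last = items[-1]          # 70.2
repeated = tiny * 2       # [123, 'john', 123, 'john']
combined = items + tiny   # concatenation creates a new list
```

Neither `*` nor `+` changes the original lists; each builds a new one.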

##### Slice Operator

What is a slice operator? The slice operator ([ ] and [:]) is to slice a list (of course).

It’s pretty simple really:

a[start:end] # items start through end-1
a[start:]    # items start through the rest of the array
a[:end]      # items from the beginning through end-1
a[:]         # a copy of the whole array


There is also the step value, which can be used with any of the above:

a[start:end:step] # start through not past end, by step


The key point to remember is that the :end value represents the first value that is not in the selected slice. So, the difference between end and start is the number of elements selected (if step is 1, the default).

The other feature is that start or end may be a negative number, which means it counts from the end of the array instead of the beginning. So:

a[-1]    # last item in the array
a[-2:]   # last two items in the array
a[:-2]   # everything except the last two items


Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
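
To make the rules above concrete, here is a short sketch on a hypothetical list `nums`, including the step value and the forgiving behaviour on short lists.

```python
nums = [0, 1, 2, 3, 4, 5]

print(nums[1:4])    # [1, 2, 3]  (4 - 1 = 3 elements)
print(nums[::2])    # [0, 2, 4]  (every second item)
print(nums[-1])     # 5          (last item)
print(nums[-2:])    # [4, 5]     (last two items)
print(nums[:-2])    # [0, 1, 2, 3]

single = [7]
print(single[:-2])  # []  (no error, just an empty list)
```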

Reference: http://stackoverflow.com/questions/509211/explain-pythons-slice-notation

##### Loop

We can loop over a list using for element in <list>:, which visits each item in order.
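
A minimal sketch of looping over a list; the names `fruits` and `lengths` are only for illustration.

```python
fruits = ['apple', 'banana', 'cherry']
lengths = []

for fruit in fruits:            # fruit takes each item in turn
    print(fruit)
    lengths.append(len(fruit))  # collect the length of each name

print(lengths)                  # [5, 6, 6]
```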

#### Tuple

A tuple is another sequence data type that is similar to the list. A tuple consists of a number of values separated by commas. Unlike lists, however, tuples are enclosed within parentheses.

The main differences between lists and tuples are: Lists are enclosed in brackets ( [ ] ) and their elements and size can be changed, while tuples are enclosed in parentheses ( ( ) ) and cannot be updated. Tuples can be thought of as read-only lists.
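
The read-only behaviour can be seen in a short sketch (the tuple `point` is a made-up example): reading an item works as with a list, while assigning to one raises a `TypeError`.

```python
point = (3, 4)
print(point[0])       # reading works as with a list: 3

try:
    point[0] = 10     # tuples cannot be updated
except TypeError as error:
    print('Cannot modify a tuple:', error)

print(point)          # still (3, 4)
```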

#### String

Strings in Python are contiguous sequences of characters enclosed in quotation marks. Python allows either single or double quotes, as long as they are used in matching pairs. Subsets of strings can be taken using the slice operator ([ ] and [:] ) with indexes starting at 0 in the beginning of the string and working their way from -1 at the end.

The plus (+) sign is the string concatenation operator and the asterisk (*) is the multiple concatenation operator.

For example,

s = 'Hello' + 'LGU'  # s = 'HelloLGU'
s = 'A'*10             # s = 'AAAAAAAAAA'


A string behaves much like a list of characters or, more accurately, like a tuple: it is immutable, so its characters cannot be changed in place. If we want to update a string, what can we do?
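
One possible answer, sketched with a hypothetical string `s`: build a new string from pieces of the old one, for example with slicing and concatenation or with `str.replace`.

```python
s = 'Hello LGU'

try:
    s[0] = 'J'                          # item assignment is not allowed
except TypeError as error:
    print('Cannot modify a string:', error)

jello = 'J' + s[1:]                     # new string built from a slice
renamed = s.replace('LGU', 'CUHKSZ')    # replace returns a new string

print(jello)    # Jello LGU
print(renamed)  # Hello CUHKSZ
print(s)        # Hello LGU (the original is unchanged)
```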

#### Dictionaries

Python’s dictionaries are a kind of hash table. They work like associative arrays or hashes found in Perl and consist of key-value pairs. A dictionary key can be any hashable Python object, which in practice usually means a number or a string. Values, on the other hand, can be any Python object at all.

Dictionaries are enclosed in curly braces ({ }), and values are assigned and accessed by key using square brackets ([]). The elements of a dict have no positional index, so there is no slicing; since Python 3.7 a dict does preserve insertion order when it is printed or looped over.

Example:

d = {'lgu': 'cuhksz',
     'cuhk': 'shatin',
     631: ['Guangdong', 'Zhejiang']}
print(d)  # what will be printed?
d[0]  # No slice operator or position here: this looks up the key 0 and raises KeyError.


We can also loop a dictionary.
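
A short sketch of two common ways to loop over a dictionary: over the keys alone, or over key-value pairs with `items()`. The dictionary `campus` reuses values from the example above.

```python
campus = {'lgu': 'cuhksz', 'cuhk': 'shatin'}

for key in campus:                  # looping over a dict yields its keys
    print(key, '->', campus[key])

for key, value in campus.items():   # items() yields (key, value) pairs
    print(key, '->', value)

pairs = list(campus.items())
print(pairs)
```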

#### Boolean (bool)

Python contains a built-in Boolean type, bool, which takes one of two values: True or False.

Number 0 can also be used to represent False. All other numbers represent True.

Example:
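
The original example is missing here; as a stand-in, this sketch uses the built-in `bool()` to show how numbers map to the two Boolean values.

```python
print(bool(0))      # False
print(bool(0.0))    # False
print(bool(7))      # True
print(bool(-1.5))   # True: any non-zero number counts as true

is_bigger = 3 > 2   # comparisons produce bool values
print(is_bigger, type(is_bigger))

if 5:               # a non-zero number is treated as True in a condition
    print('5 counts as True')
```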

#### Data Type Conversion

Sometimes, you may need to perform conversions between the built-in types. To convert between types, you simply use the type name as a function. Most conversions have been shown above.
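
A few conversions, sketched with made-up values, show the "type name as a function" pattern:

```python
print(int('42'))        # string to int: 42
print(float('3.5'))     # string to float: 3.5
print(int(3.9))         # float to int truncates toward zero: 3
print(str(10))          # int to string: '10'
print(list('abc'))      # string to list of characters: ['a', 'b', 'c']
print(tuple([1, 2]))    # list to tuple: (1, 2)
print(bool(''))         # empty string to bool: False
```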

### Logic Statements

Reference: http://python-textbok.readthedocs.io/en/latest/Selection_Control_Statements.html

In procedurally written code, the computer usually executes instructions in the order that they appear. However, this is not always the case. One of the ways in which programmers can change the flow of control is the use of selection control statements.

Selection statements allow a program to choose when to execute certain instructions. For example, a program might choose how to proceed on the basis of the user’s input. As you will be able to see, such statements make a program more versatile.

#### Selection: if statement

People make decisions on a daily basis. What should I have for lunch? What should I do this weekend? Every time you make a decision you base it on some criterion. For example, you might decide what to have for lunch based on your mood at the time, or whether you are on some kind of diet. After making this decision, you act on it. Thus decision-making is a two step process – first deciding what to do based on a criterion, and secondly taking an action.

Decision-making by a computer is based on the same two-step process. In Python, decisions are made with the if statement, also known as the selection statement. When processing an if statement, the computer first evaluates some criterion or condition. If it is met, the specified action is performed. Here is the syntax for the if statement:

When it reaches an if statement, the computer executes the body of the statement only if the condition is true. Here is an example in Python, with a corresponding flowchart:
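
The original example and flowchart are not reproduced here. As a stand-in, this is a minimal sketch with a hypothetical variable `age`; the general form is shown in the leading comment.

```python
# General form:
#     if condition:
#         statement(s)

age = 20
message = 'no message'

if age >= 18:              # the condition
    message = 'adult'      # the if body, run only when the condition is true

print(message)             # adult
```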

As we can see from the flowchart, the instructions in the if body are only executed if the condition is met (i.e. if it is true). If the condition is not met (i.e. false), the instructions in the if body are skipped.

#### The else clause

An optional part of an if statement is the else clause. It allows us to specify an alternative instruction (or set of instructions) to be executed if the condition is not met:

To put it another way, the computer will execute the if body if the condition is true, otherwise it will execute the else body. In the example below, the computer will add 1 to x if it is zero, otherwise it will subtract 1 from x:
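
A sketch of the statement just described, assuming `x` starts at zero:

```python
x = 0

if x == 0:
    x = x + 1   # if body: runs because x is zero
else:
    x = x - 1   # else body: would run for any other value

print(x)        # 1
```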

This flowchart represents the same statement:

The computer will execute one of the branches before proceeding to the next instruction.

#### Value vs identity

So far, we have only compared integers in our examples. We can also use any of the above relational operators to compare floating-point numbers, strings and many other types:
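
A few illustrative comparisons on other types (the values are made up). Strings and lists are compared item by item, and uppercase letters sort before lowercase ones.

```python
print(1.5 < 2.5)              # True
print('apple' < 'banana')     # True: compared character by character
print('Zebra' < 'apple')      # True: uppercase letters come first
print([1, 2] < [1, 3])        # True: lists are compared item by item
print('abc' == 'abc')         # True
```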

When comparing variables using ==, we are doing a value comparison: we are checking whether the two variables have the same value. In contrast to this, we might want to know if two objects such as lists, dictionaries or custom objects that we have created ourselves are the exact same object. This is a test of identity. Two objects might have identical contents, but be two different objects. We compare identity with the is operator:
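
A minimal sketch of the difference, using three hypothetical names: `a` and `b` are separate lists with the same contents, while `c` is a second name for the same object as `a`.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a different list with identical contents
c = a           # another name for the very same list

print(a == b)   # True: same value
print(a is b)   # False: two different objects
print(a is c)   # True: the same object

c.append(4)     # changing the object through c...
print(a)        # ...is visible through a: [1, 2, 3, 4]
```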

It is generally the case (with some caveats) that if two variables are the same object, they are also equal. The reverse is not true – two variables could be equal in value, but not the same object.

To test whether two objects are not the same object, we can use the is not operator:

Note: In many cases, variables of built-in immutable types which have the same value will also be identical. In some cases this is because the Python interpreter saves memory (and comparison time) by representing multiple values which are equal by the same object. You shouldn’t rely on this behaviour and make value comparisons using is – if you want to compare values, always use ==.

#### Nested if statements

In some cases you may want one decision to depend on the result of an earlier decision. For example, you might only have to choose which shop to visit if you decide that you are going to do your shopping, or what to have for dinner after you have made a decision that you are hungry enough for dinner.

In Python this is equivalent to putting an if statement within the body of either the if or the else clause of another if statement. The following code fragment calculates the cost of sending a small parcel. The post office charges R5 for the first 300g, and R2 for every 100g thereafter (rounded up), up to a maximum weight of 1000g:
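
The original fragment is not shown here, so the following is a reconstruction under assumptions: the weight is a hypothetical fixed value, and "rounded up" is implemented with `math.ceil`.

```python
import math

weight = 450  # hypothetical parcel weight in grams

if weight <= 1000:
    if weight <= 300:
        cost = 5
    else:
        # R2 for every started 100g beyond the first 300g
        cost = 5 + 2 * math.ceil((weight - 300) / 100)

    print("Your parcel will cost R%d." % cost)
else:
    print("Maximum weight for a small parcel exceeded.")
```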

Note that the bodies of the outer if and else clauses are indented, and the bodies of the inner if and else clauses are indented one more time. It is important to keep track of indentation, so that each statement is in the correct block. It doesn’t matter that there’s an empty line between the last line of the inner if statement and the following print statement – they are still both part of the same block (the outer if body) because they are indented by the same amount. We can use empty lines (sparingly) to make our code more readable.

#### The elif clause and if ladders

The addition of the else keyword allows us to specify actions for the case in which the condition is false. However, there may be cases in which we would like to handle more than two alternatives. For example, here is a flowchart of a program which works out which grade should be assigned to a particular mark in a test:

We should be able to write a code fragment for this program using nested if statements. It might look something like this:

This code is a bit difficult to read. Every time we add a nested if, we have to increase the indentation, so all of our alternatives are indented differently. We can write this code more cleanly using elif clauses:
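
A sketch of what such an elif version could look like. The grade boundaries and the function name `grade_for` are assumptions, since the original flowchart is not available here.

```python
def grade_for(mark):
    """Map a numeric mark to a letter grade (hypothetical boundaries)."""
    if mark >= 80:
        grade = 'A'
    elif mark >= 65:
        grade = 'B'
    elif mark >= 50:
        grade = 'C'
    else:
        grade = 'D'
    return grade

print(grade_for(72))   # B
```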

Now all the alternatives are clauses of one if statement, and are indented to the same level. This is called an if ladder. Here is a flowchart which more accurately represents this code:

The default (catch-all) condition is the else clause at the end of the statement. If none of the conditions specified earlier is matched, the actions in the else body will be executed. It is a good idea to include a final else clause in each ladder to make sure that we are covering all cases, especially if there’s a possibility that the options will change in the future. Consider the following code fragment:
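
The fragment itself is missing from this copy. A sketch in the same spirit follows; the supported course codes and department names are invented for illustration.

```python
course_code = "INF"   # an unexpected code

if course_code == "CSC":
    department_name = "Computer Science"
elif course_code == "MAM":
    department_name = "Mathematics and Applied Mathematics"
elif course_code == "STA":
    department_name = "Statistical Sciences"
else:
    # catch-all: report the problem straight away
    message = "Unsupported course code: " + course_code
    print(message)
```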

What if we unexpectedly encounter an informatics course, which has a course code of "INF"? The catch-all else clause will be executed, and we will immediately see a printed message that this course code is unsupported. If the else clause were omitted, we might not have noticed that anything was wrong until we tried to use department_name and discovered that it had never been assigned a value. Including the else clause helps us to pick up potential errors caused by missing options early.

#### If statement in one line?

Look back at the previous example.

It is clear, but it takes 4 lines. Can we turn it into one line?

Of course you can :)

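The syntax is value-if-true if condition else value-if-false. Note that this is an expression that produces a value, so it is usually placed on the right-hand side of an assignment. It is easy to understand, right? One reminder: keep your code readable, and fall back to the full if statement when the one-line form gets hard to follow.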
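
Assuming the example in question is the if/else statement that adds or subtracts 1 from `x`, a one-line version could be sketched as:

```python
x = 0
x = x + 1 if x == 0 else x - 1   # same effect as the four-line if/else
print(x)                         # 1

# The conditional expression can be used anywhere a value is expected.
parity = 'even' if x % 2 == 0 else 'odd'
print(parity)                    # odd
```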

### Loop statement

A loop statement allows us to execute a statement or group of statements multiple times.

Python provides the following types of loops to handle looping requirements.

| Loop Type | Description |
| --- | --- |
| while loop | Repeats a statement or group of statements while a given condition is TRUE. It tests the condition before executing the loop body. |
| for loop | Executes a sequence of statements multiple times and abbreviates the code that manages the loop variable. |
| nested loops | You can use one or more loops inside another while or for loop. |

Loop control statements change execution from its normal sequence. Note that a loop body does not create a new scope in Python: names assigned inside the loop remain available after it ends.
Python supports the following control statements.

| Control Statement | Description |
| --- | --- |
| break statement | Terminates the loop statement and transfers execution to the statement immediately following the loop. |
| continue statement | Causes the loop to skip the remainder of its body and immediately retest its condition prior to reiterating. |
| pass statement | The pass statement in Python is used when a statement is required syntactically but you do not want any command or code to execute. |

#### While loop statement

A while loop statement in Python programming language repeatedly executes a target statement as long as a given condition is true.

The syntax of a while loop in Python programming language is −
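
The syntax listing is missing from this copy; the general form is given as a comment in the sketch below, followed by a small runnable example with a hypothetical counter.

```python
# General form:
#     while condition:
#         statement(s)

count = 0
while count < 3:              # tested before every pass through the body
    print('count is', count)
    count += 1                # without this the condition would never change

print('done, count =', count)
```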

Here, statement(s) may be a single statement or a block of statements with uniform indent. The condition may be any expression, and true is any non-zero value. The loop iterates while the condition is true.
When the condition becomes false, program control passes to the line immediately following the loop.
In Python, all the statements indented by the same number of character spaces after a programming construct are considered to be part of a single block of code. Python uses indentation as its method of grouping statements.

A key point of the while loop is that the loop body might never run. If the condition is false the first time it is tested, the loop body is skipped and the first statement after the while loop is executed.

##### The Infinite Loop

A loop becomes an infinite loop if its condition never becomes false. You must use caution with while loops because of this possibility: if nothing in the loop body ever changes the condition, the loop never ends.

An infinite loop might be useful in client/server programming where the server needs to run continuously so that client programs can communicate with it as and when required.
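
A toy sketch of that server pattern: the list `requests` stands in for incoming client messages (an assumption made so the example can run on its own), and `break` is the only way out of the `while True` loop.

```python
requests = ['ping', 'status', 'quit', 'ignored']  # stand-in for client messages
handled = []

while True:                      # the condition itself never becomes false
    request = requests.pop(0)    # take the next message
    if request == 'quit':
        break                    # the only exit from the loop
    handled.append(request)

print(handled)                   # ['ping', 'status']
```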

#### For loop statement

The for statement in Python has the ability to iterate over the items of any sequence, such as a list or a string.

If a sequence contains an expression list, it is evaluated first. Then, the first item in the sequence is assigned to the iterating variable iterating_var. Next, the statements block is executed. Each item in the list is assigned to iterating_var, and the statement(s) block is executed until the entire sequence is exhausted.
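
The syntax listing is missing here; the general form appears as a comment in this sketch, followed by two small loops over a string and a list (both made up).

```python
# General form:
#     for iterating_var in sequence:
#         statement(s)

for letter in 'LGU':          # a string is a sequence of characters
    print(letter)

total = 0
for price in [5, 10, 15]:     # price takes each list item in turn
    total += price

print(total)                  # 30
```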

##### The range() function

The built-in function range() is the right tool for iterating over a sequence of numbers. It produces an arithmetic progression lazily, one number at a time, rather than building a full list. For details, read the built-in help with help(range).
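
A few calls, sketched below, show the three forms `range(stop)`, `range(start, stop)` and `range(start, stop, step)`; wrapping a range in `list()` makes its contents visible.

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(2, 6)))       # [2, 3, 4, 5]
print(list(range(2, 10, 3)))   # [2, 5, 8]
print(list(range(5, 0, -1)))   # [5, 4, 3, 2, 1]

squares = []
for n in range(1, 4):          # the usual way to repeat over numbers
    squares.append(n * n)
print(squares)                 # [1, 4, 9]
```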

#### Using else Statement with Loops

Python allows an else clause to be attached to a loop statement.

If the else clause is used with a for loop, it is executed when the loop has finished iterating over the sequence.

If the else clause is used with a while loop, it is executed when the condition becomes false.

In both cases the else clause is skipped if the loop is left with a break statement.
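
A sketch of a typical use: searching for something and handling the "not found" case in the else clause. The function `first_factor` is a hypothetical helper written for this illustration.

```python
def first_factor(n):
    """Return the smallest factor of n between 2 and n - 1, or None."""
    for candidate in range(2, n):
        if n % candidate == 0:
            result = candidate
            break                 # found one: the else clause is skipped
    else:
        result = None             # the loop ran to the end without a break
    return result

print(first_factor(9))   # 3
print(first_factor(7))   # None
```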

### Exception

An exception is an event, which occurs during the execution of a program that disrupts the normal flow of the program’s instructions. In general, when a Python script encounters a situation that it cannot cope with, it raises an exception. An exception is a Python object that represents an error.

When a Python script raises an exception, it must handle the exception; otherwise the script terminates and quits.

#### Handling an exception

If you have some suspicious code that may raise an exception, you can defend your program by placing the suspicious code in a try: block. After the try: block, include an except: statement, followed by a block of code which handles the problem as elegantly as possible.

Here are a few important points about the above-mentioned syntax:

• A single try statement can have multiple except statements. This is useful when the try block contains statements that may throw different types of exceptions.
• You can also provide a generic except clause, which handles any exception.
• After the except clause(s), you can include an else-clause. The code in the else-block executes if the code in the try: block does not raise an exception.
• The else-block is a good place for code that does not need the try: block’s protection.
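
The points above can be sketched in one small function. `safe_divide` is a hypothetical helper, with two except clauses for different exception types and an else clause for the success path.

```python
def safe_divide(a, b):
    """Divide a by b, returning a message instead of raising."""
    try:
        result = a / b                       # the suspicious code
    except ZeroDivisionError:
        return 'cannot divide by zero'
    except TypeError:
        return 'both arguments must be numbers'
    else:
        # runs only when the try block raised no exception
        return result

print(safe_divide(6, 3))     # 2.0
print(safe_divide(1, 0))     # cannot divide by zero
print(safe_divide('a', 2))   # both arguments must be numbers
```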

#### Raise an exception

You can raise an exception yourself by using the raise statement. In Python 3, the general syntax is as follows.

raise [Exception[(args)]]

Here, Exception is the type of exception (for example, NameError) and args is the value passed to it, typically an error message. The arguments are optional; if none are supplied, the exception is created with no arguments.

The older three-part form raise Exception, args, traceback is Python 2 syntax and is not valid in Python 3. There, a traceback object, which is rarely needed in practice, is attached with the exception's with_traceback() method.
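
A minimal sketch of raising and catching an exception in Python 3. The function `check_age` and its message are made up for illustration.

```python
def check_age(age):
    """Return age unchanged, or raise ValueError if it is negative."""
    if age < 0:
        raise ValueError('age must not be negative')
    return age

try:
    check_age(-5)
except ValueError as error:
    caught = str(error)           # the argument given to the exception
    print('Caught:', caught)
```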

### Functions

A function is a block of organized, reusable code that is used to perform a single, related action. Functions provide better modularity for your application and a high degree of code reuse.

As you already know, Python gives you many built-in functions like print(), etc. but you can also create your own functions. These functions are called user-defined functions.

The names of built-in functions are best treated as if they were reserved words, i.e. we do not use them as variable names.

#### Defining a Function

You can define functions to provide the required functionality. Here are simple rules to define a function in Python.

• Function blocks begin with the keyword def followed by the function name and parentheses ( ( ) ).
• Any input parameters are listed within these parentheses, separated by commas.
• The first statement of a function can be an optional statement - the documentation string of the function or docstring.
• The code block within every function starts with a colon (:) and is indented.
• The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None.
• A function that has no return statement, or never reaches one, returns None by default. Such a function is sometimes called a void function.
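
The rules above can be sketched with two hypothetical functions: `greet` returns a value, while `log` has no return statement and therefore returns None.

```python
def greet(name):
    """Return a greeting for name (this string is the docstring)."""
    return 'Hello, ' + name + '!'

def log(message):
    """Print message; there is no return statement here."""
    print('LOG:', message)

print(greet('LGU'))      # Hello, LGU!
outcome = log('saved')   # prints LOG: saved
print(outcome)           # None
```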

#### Function Arguments

You can call a function by using the following types of formal arguments:

• Required arguments
• Keyword arguments
• Default arguments
• Variable-length arguments

##### Required arguments

Required arguments are the arguments passed to a function in correct positional order. Here, the number of arguments in the function call should match exactly with the function definition.
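
A small sketch with a hypothetical function `describe`: calling it with both arguments in order works, while leaving one out raises a `TypeError`.

```python
def describe(name, age):
    """Both parameters are required and matched by position."""
    return name + ' is ' + str(age)

print(describe('Ann', 20))       # Ann is 20

try:
    describe('Ann')              # one argument is missing
except TypeError as error:
    print('Call failed:', error)
```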

##### Keyword arguments

Keyword arguments are related to the function calls. When you use keyword arguments in a function call, the caller identifies the arguments by the parameter name.

This allows you to skip arguments or place them out of order because the Python interpreter is able to use the keywords provided to match the values with parameters.

The following example gives a clearer picture. Note that the order of the arguments does not matter.
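
The original example is not included in this copy; a stand-in sketch with a hypothetical function `describe` shows the same idea.

```python
def describe(name, age):
    return name + ' is ' + str(age)

by_position = describe('Ann', 20)
by_keyword = describe(age=20, name='Ann')   # order swapped, names given

print(by_position)                 # Ann is 20
print(by_keyword == by_position)   # True
```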

##### Default arguments

A default argument is an argument that assumes a default value if a value is not provided in the function call for that argument. The following example illustrates default arguments: it prints the default age if no age is passed.
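
The original example is not shown here. A sketch along the same lines, with a hypothetical function `info` and an assumed default age of 35:

```python
def info(name, age=35):
    """age falls back to 35 when the caller does not pass it."""
    return 'Name: ' + name + ', Age: ' + str(age)

print(info('Ann', 20))    # Name: Ann, Age: 20
print(info('Bob'))        # Name: Bob, Age: 35
print(info(name='Cy'))    # keyword style works too: Name: Cy, Age: 35
```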

##### Variable-length arguments

You may need to call a function with more arguments than you specified while defining the function. These arguments are called variable-length arguments and are not named in the function definition, unlike required and default arguments.

Syntax for a function with non-keyword variable arguments is this −

An asterisk (*) is placed before the variable name that holds the values of all non-keyword variable arguments. This tuple remains empty if no additional arguments are specified during the function call. Following is a simple example −
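
Neither the syntax listing nor the example survives in this copy. The general form is given as a comment in the sketch below; `collect` is a hypothetical function name.

```python
# General form:
#     def function_name(formal_args, *var_args_tuple):
#         statement(s)

def collect(first, *others):
    """others is a tuple holding every extra positional argument."""
    return first, others

print(collect(10))            # (10, ())  (the tuple is empty)
print(collect(70, 60, 50))    # (70, (60, 50))
```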

#### The return Statement

The statement return [expression] exits a function, optionally passing back an expression to the caller. A return statement with no arguments is the same as return None.

##### Return multiple values

Python allows a function to return multiple values, which are packed into a tuple. The sort function returns two values; when it is invoked, you receive the returned values with a simultaneous assignment. Try to run it!
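
The sort function referred to above is not included in this copy. A plausible reconstruction, offered only as a sketch, takes two numbers and returns them in ascending order:

```python
def sort(a, b):
    """Return the two arguments in ascending order (hypothetical version)."""
    if a <= b:
        return a, b        # two values, packed into a tuple
    return b, a

small, large = sort(5, 2)  # simultaneous assignment unpacks the tuple
print(small, large)        # 2 5
print(sort(1, 9))          # (1, 9)
```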