Skip to content

Data Lab

Posted on:2021.08.31

TOC

Open TOC

Data Lab

CSAPP Lab 下载对应的 tar 包并解压。

环境配置:

# sudo apt-get install gcc-multilib

README 文件解读:

  1. bits.c 中进行实验
  2. 通过 dlc 检查 bits.c 是否符合编码规则
  3. 通过 make btest 编译 bits.c,第一次会生成 btest 文件
  4. 通过 btest 检查代码正确性,btest 不会检查代码是否符合编码规则
/*
* bitXor - x^y using only ~ and &
* Example: bitXor(4, 5) = 1
* Legal ops: ~ &
* Max ops: 14
* Rating: 1
*/
int bitXor(int x, int y) {
int a = x & ~y;
int b = ~x & y;
return ~((~a)&(~b));
}
/*
* tmin - return minimum two's complement integer
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 4
* Rating: 1
*/
int tmin(void) {
return (1<<31);
}
/*
* isTmax - returns 1 if x is the maximum, two's complement number,
* and 0 otherwise
* Legal ops: ! ~ & ^ | +
* Max ops: 10
* Rating: 1
*/
int isTmax(int x) {
int a = !((x+1)^(~x));
int b = !!~x;
return a&b;
}

a 的作用是判断 x+1~x 是否相等,b 的作用是排除特例 -1

/*
* allOddBits - return 1 if all odd-numbered bits in word set to 1
* where bits are numbered from 0 (least significant) to 31 (most significant)
* Examples allOddBits(0xFFFFFFFD) = 0, allOddBits(0xAAAAAAAA) = 1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 12
* Rating: 2
*/
int allOddBits(int x) {
int mask = 0xaa;
int a = x&mask;
int b = x>>8&mask;
int c = x>>16&mask;
int d = x>>24&mask;
return !(0x2a8^(a+b+c+d));
}

别想太多,每一次测试 8 位即可,注意 0x2a8=4*0xaa,且正好用了 12 个运算符……

注:上面的写法使用了常数 0x2a8,不符合编码规范,且相加操作也许会有问题,所以可以像下面逐步构造 mask,最后一步到位

int allOddBits(int x) {
int mask = 0xaa;
mask = mask << 8 | mask;
mask = mask << 8 | mask;
mask = mask << 8 | mask;
return !(mask^(mask&x));
}
/*
* negate - return -x
* Example: negate(1) = -1.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 5
* Rating: 2
*/
int negate(int x) {
return ~x+1;
}
//3
/*
* isAsciiDigit - return 1 if 0x30 <= x <= 0x39 (ASCII codes for characters '0' to '9')
* Example: isAsciiDigit(0x35) = 1.
* isAsciiDigit(0x3a) = 0.
* isAsciiDigit(0x05) = 0.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 15
* Rating: 3
*/
int isAsciiDigit(int x) {
int mask = 0x3;
int a = !(mask^(x>>4&mask)); // 1
int b = x>>6; // 0
int c = (x>>3)&0x1;
int d = (x&0x7)>>1;
// exclude c = 1 and d = 1
return a&!b&(!c|!d);
}

正好用了 15 个运算符……

a 判断从低到高的第 5 位是否为 3b 判断更高位是否都是 0c 取从低到高的第 4 位,d 取从低到高的第 2 位和第 3 位,若 c1,则 d 必须为 0,若 c0,则 d 无所谓,总之 cd 不能同时为 1

后来感觉想复杂了,只要这样:

int isAsciiDigit(int x) {
int a = (x+((~0x30)+1))>>31;
int b = (x+((~0x3a)+1))>>31;
return (!a)&b;
}
/*
* conditional - same as x ? y : z
* Example: conditional(2,4,5) = 4
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 16
* Rating: 3
*/
int conditional(int x, int y, int z) {
int cond = !!x;
int mask = ~cond+1;
return (y&mask)|(z&~mask);
}

通过双重 !! 归一化,若 cond1,则 mask 为全 1,否则为全 0,由此实现选择

/*
* isLessOrEqual - if x <= y then return 1, else return 0
* Example: isLessOrEqual(4,5) = 1.
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 24
* Rating: 3
*/
int isLessOrEqual(int x, int y) {
int a = x>>31;
int b = y>>31;
int c = !((y+(~x+1))>>31); // y - x >= 0
int d = a&(!b); // x < 0 <= y
int e = (!a)&b; // y < 0 <= x
int f = !(x^(1<<31)); // x is min
int g = !(x^y); // x == y
return (c|d|f|g)&(!e);
}

需要注意 xy 异号的情形,此时 y-x 会溢出,所以需要单独讨论这两种情形,另外 min 取反加一还是自身,也需要单独讨论,对于 xy 相等的情形也(许)要单独讨论

最后汇总时,将结果为 1 的情形相或,并与上结果为 0 的情形的非

/*
* logicalNeg - implement the ! operator, using all of
* the legal operators except !
* Examples: logicalNeg(3) = 0, logicalNeg(0) = 1
* Legal ops: ~ & ^ | + << >>
* Max ops: 12
* Rating: 4
*/
int logicalNeg(int x) {
int a = x|(~x+1);
int b = a>>31;
return b+1;
}

这题一直想不出来 🤣,解释一下,将自身与自身取负相或(怎么想出来的 😒),除了 0 之外,所有的结果的最高位都是 1,算术右移后,0 对应 0其余对应 0xffffffff,加 1 就可以实现了

/* howManyBits - return the minimum number of bits required to represent x in
* two's complement
* Examples: howManyBits(12) = 5
* howManyBits(298) = 10
* howManyBits(-5) = 4
* howManyBits(0) = 1
* howManyBits(-1) = 1
* howManyBits(0x80000000) = 32
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 90
* Rating: 4
*/
int howManyBits(int x) {
int a, b, c, d, e, f;
int cond = !!(x>>31);
int mask = ~cond+1;
x = (mask&(~x))|((~mask)&x);
a = !!(x>>16);
mask = ~a+1;
x = (mask&(x>>16))|((~mask)&x);
b = !!(x>>8);
mask = ~b+1;
x = (mask&(x>>8))|((~mask)&x);
c = !!(x>>4);
mask = ~c+1;
x = (mask&(x>>4))|((~mask)&x);
d = !!(x>>2);
mask = ~d+1;
x = (mask&(x>>2))|((~mask)&x);
e = !!(x>>1);
mask = ~e+1;
x = (mask&(x>>1))|((~mask)&x);
f = !!(x);
return (a<<4)+(b<<3)+(c<<2)+(d<<1)+e+f+1;
}

这一题就比较唬人了 🤣,竟然给了 90 个操作符,首先我们分析一下样例,对于正数而言,从最高位往低位数,碰到 1 停止,对于负数而言,从最高位往低位数,碰到 0 停止。为了统一这两种情况,我们根据符号位将负数取反,正数则保持不变。

接下来的部分就比较魔幻了 🤣,显然我们不能一位一位的判断,因此我们使用二分的思想。先判断高 16 位中是否有 1,并使用局部变量(编码规则要求所有的变量在最开始声明)记录下来,若有 1,我们将数右移 16 位;否则高 16 位都是 0,只需判断低 16 位即可(保持不变)。如此进行,最后将记录的局部变量附上一定的权重,因为符号位总是 0,所以最后要加 1(考虑 howManyBits(0) = 1)。

/*
* floatScale2 - Return bit-level equivalent of expression 2*f for
* floating point argument f.
* Both the argument and result are passed as unsigned int's, but
* they are to be interpreted as the bit-level representation of
* single-precision floating point values.
* When argument is NaN, return argument
* Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
* Max ops: 30
* Rating: 4
*/
unsigned floatScale2(unsigned uf) {
int exp = (uf>>23)&0xff;
int mask = (1<<22)&uf;
if(exp==0xff){ // inf or nan
return uf;
}
else if(exp==0){ // denormalized
if(mask==0){
return ((uf&0x3fffff)<<1)|(uf&0x80000000);
}
else{
return ((uf&0x7fffff)<<1)|(uf&0x80000000);
}
}
else{ // normalized
exp=exp+1;
return (uf&0x807fffff)|(exp<<23);
}
}

终于到了浮点数的部分,我们能够使用的操作变多了 🤣,比如任意大小的常数,==if 语句等。

这题考察对浮点数表示的理解,注意到这里是 32 位浮点数,所以符号为 1 位,阶码为 8 位,尾数为 23 位。先根据阶码判断是否为无穷大或者 NaN,这种情形直接返回。若是规格化的值,我们将阶码加一即可(此处可能会溢出,不过题目中并没有特殊的要求)。若是非规格化的值,我们需要根据尾数的最高位分类讨论,若尾数的最高位为 0,则将尾数部分左移一位即可;否则,该浮点表示从非规格化表示变为了规格化表示。尾数部分应该乘二减一,阶码部分加一,整体考虑就相当于将尾数部分左移一位(使阶码部分的最低位变为 1)。

/*
* floatFloat2Int - Return bit-level equivalent of expression (int) f
* for floating point argument f.
* Argument is passed as unsigned int, but
* it is to be interpreted as the bit-level representation of a
* single-precision floating point value.
* Anything out of range (including NaN and infinity) should return
* 0x80000000u.
* Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
* Max ops: 30
* Rating: 4
*/
int floatFloat2Int(unsigned uf) {
int s = uf>>31;
int exp = (uf>>23)&0xff;
int frac = uf&0x007fffff;
int c = 23;
if(exp==0xff){ // inf or nan
return 1<<31;
}else if(exp<=126){
return 0;
}
else if(exp>=158){
return 1<<31;
}else{
frac=frac|(1<<23);
exp -= 127;
while(exp>=1){
if(c>=1){
c=c-1;
}else{
frac=frac*2;
}
exp=exp-1;
}
while(c>=1){
frac=frac/2;
c=c-1;
}
if(s==0){
return frac;
}else{
return ~frac+1;
}
}
}

首先,浮点数转补码,向零舍入。其次,需要对阶码进行精细的讨论:

  1. 阶码全为 1:无穷大或者 NaN
  2. 阶码 ≤ 126:若为非规格化,显然舍入为 0,若为规格化,尾数 ∈[1,2),所以只要 E<=-1,即阶码 ≤ 126,就会舍入为 0
  3. 阶码 ≥ 158:此时 E ≥ 31,对于正数而言,结果必定 ≥ 2^31,显然溢出,对于负数而言,会包含 -2^31,然而 -2^31 的表示就是 1<<31,所以不需要单独讨论
  4. 其余情形:将尾数提取出来并或上 1,记为 1+f,不考虑符号,最终结果为 (1+f)*2^E/2^23,由于直接计算会溢出,将 E23 的先抵消一部分,也就是计算 (1+f)*2^(E-23),最后根据符号位进行取反加一处理
/*
* floatPower2 - Return bit-level equivalent of the expression 2.0^x
* (2.0 raised to the power x) for any 32-bit integer x.
*
* The unsigned value that is returned should have the identical bit
* representation as the single-precision floating-point number 2.0^x.
* If the result is too small to be represented as a denorm, return
* 0. If too large, return +INF.
*
* Legal ops: Any integer/unsigned operations incl. ||, &&. Also if, while
* Max ops: 30
* Rating: 4
*/
unsigned floatPower2(int x) {
int pos_inf = 0x7f800000;
int ans = 0;
if(x<-149){
return 0;
}else if(x>127){
return pos_inf;
}else if(x>=-126){
x+=127;
return x<<23;
}else{
x+=149;
while(x>0){
ans=ans<<1|1;
x--;
}
return x;
}
}

最后一题还挺简单,只要知道单精度浮点数表示的范围 ∈[2^(-23)*2^(-126),(2-2^(-23))*2^127] 即可

Terminal window
vgalaxy@vgalaxy-VirtualBox:~/Lab/datalab-handout$ ./dlc bits.c
vgalaxy@vgalaxy-VirtualBox:~/Lab/datalab-handout$ ./btest
Score Rating Errors Function
1 1 0 bitXor
1 1 0 tmin
1 1 0 isTmax
2 2 0 allOddBits
2 2 0 negate
3 3 0 isAsciiDigit
3 3 0 conditional
3 3 0 isLessOrEqual
4 4 0 logicalNeg
4 4 0 howManyBits
4 4 0 floatScale2
4 4 0 floatFloat2Int
4 4 0 floatPower2
Total points: 36/36
vgalaxy@vgalaxy-VirtualBox:~/Lab/datalab-handout$ date
2021 09 03 星期五 15:39:42 CST