-
Notifications
You must be signed in to change notification settings - Fork 4
/
04_read_us_trade.log
634 lines (482 loc) · 16.3 KB
/
04_read_us_trade.log
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
___ ____ ____ ____ ____ (R)
/__ / ____/ / ____/
___/ / /___/ / /___/ 13.1 Copyright 1985-2013 StataCorp LP
Statistics/Data Analysis StataCorp
4905 Lakeway Drive
Special Edition College Station, Texas 77845 USA
800-STATA-PC http://www.stata.com
979-696-4600 stata@stata.com
979-696-4601 (fax)
6-user Stata network perpetual license:
Serial number: 501306214221
Licensed to: Miklos Koren
Central European University
Notes:
1. (-v# option or -set maxvar-) 5000 maximum variables
2. Command line editing disabled
3. Stata running in batch mode
Note: Your site can add messages to the introduction by editing the file
stata.msg in the directory where Stata is installed.
. do code/04_read_us_trade.do
. /*
> The programs and data files replicate the descriptive statistics and the esti
> mation results in the paper
>
> Hornok, Cec’lia and Mikl—s Koren, forthcoming. ÒPer-Shipment Costs an
> d the Lumpiness of International Trade.Ó Review of Economics and Statistics.
>
> Please cite the above paper when using these programs.
>
> For your convenience, we reproduce some of the data resources here. Although
> all of these are widely available macroeconomic data, please check with the d
> ata vendors whether you have the right to use them.
>
> Our software and data are provided AS IS, and we assume no liability for thei
> r use or misuse.
>
> If you have any questions about replication, please contact Mikl—s Koren at k
> orenm@ceu.hu.
> */
.
. *** Reads data from the monthly trade statistics published by the U.S. Census
> Bureau, merges with importer-country variables and reports shipment-level st
> atistics. The output is data/derived/usa_export_2009_with_gravity.dta.
.
. clear all
. set more off
.
. tempfile census units
.
. * units of measurement come from a different file
. * using the December commodity-level file is sufficient
. infix str commodity 2-11 str UNIT_QY1 62-64 str UNIT_QY2 65-67 using data/cen
> sus/trade/12/EXP_COMM.TXT, clear
(16670 observations read)
. duplicates drop commodity, force
Duplicates in terms of commodity
(7919 observations deleted)
. save `units', replace
(note: file /tmp/St27755.000002 not found)
file /tmp/St27755.000002 saved
.
. clear
. save `census', replace emptyok
(note: dataset contains 0 observations)
(note: file /tmp/St27755.000001 not found)
file /tmp/St27755.000001 saved
.
. foreach month of any 01 02 03 04 05 06 07 08 09 10 11 12 {
2.
. infix df 1-1 str commodity 2-11 cty_code 12-15 district 16-17 year 18
> -21 month 22-23 cards_mo 24-38 qty1_mo 39-53 qty2_mo 54-68 ///
> all_val_mo 69-83 air_val_mo 84-98 air_wgt_mo 99-113 v
> es_val_mo 114-128 ves_wgt_mo 129-143 cnt_val_mo 144-158 cnt_wgt_mo 159-173 //
> /
> cards_yr 174-188 qty1_yr 189-203 qty2_yr 204-218 all_
> val_yr 219-233 air_val_yr 234-248 air_wgt_yr 249-263 ves_val_yr 264-278 ///
> ves_wgt_yr 279-293 cnt_val_yr 294-308 cnt_wgt_yr 309-
> 323 using data/census/trade/`month'/EXP_DETL.TXT, clear
3.
. keep if cards_mo!=0
4. keep if df==1 /* exclude re-exports */
5. drop df district year cnt_* *_yr
6.
. append using `census'
7. save `census', replace
8.
. }
(367692 observations read)
(0 observations deleted)
(65560 observations deleted)
file /tmp/St27755.000001 saved
(542778 observations read)
(166042 observations deleted)
(66686 observations deleted)
file /tmp/St27755.000001 saved
(686584 observations read)
(287541 observations deleted)
(71081 observations deleted)
file /tmp/St27755.000001 saved
(797604 observations read)
(411830 observations deleted)
(68756 observations deleted)
file /tmp/St27755.000001 saved
(895509 observations read)
(505629 observations deleted)
(69043 observations deleted)
file /tmp/St27755.000001 saved
(983821 observations read)
(590914 observations deleted)
(70304 observations deleted)
file /tmp/St27755.000001 saved
(1065372 observations read)
(667626 observations deleted)
(71811 observations deleted)
file /tmp/St27755.000001 saved
(1141627 observations read)
(743807 observations deleted)
(71629 observations deleted)
file /tmp/St27755.000001 saved
(1212949 observations read)
(810688 observations deleted)
(73554 observations deleted)
file /tmp/St27755.000001 saved
(1286132 observations read)
(861322 observations deleted)
(77602 observations deleted)
file /tmp/St27755.000001 saved
(1349923 observations read)
(942399 observations deleted)
(74455 observations deleted)
file /tmp/St27755.000001 saved
(1412421 observations read)
(998574 observations deleted)
(76941 observations deleted)
file /tmp/St27755.000001 saved
.
. ren cty_code census_code
. compress
census_code was float now int
month was float now byte
cards_mo was float now int
(27,290,326 bytes saved)
.
. * add units of measurement
. merge m:1 commodity using `units',
Result # of obs.
-----------------------------------------
not matched 19
from master 0 (_merge==1)
from using 19 (_merge==2)
matched 3,898,618 (_merge==3)
-----------------------------------------
. drop _m
.
. destring commodity, replace
commodity has all characters numeric; replaced as double
.
.
. * labeling
. lab var commodity "10-digit product code"
. lab var census_code "Census country code"
. lab var month "Month"
. lab var cards_mo "Number of shipments"
. lab var qty1_mo "Traded quantity - in 1st unit of measure"
. lab var qty2_mo "Traded quantity - in 2nd unit of measure"
. lab var all_val_mo "Traded value"
. lab var air_val_mo "Traded value, shipped by air"
. lab var air_wgt_mo "Traded weight in kg, shipped by air"
. lab var ves_val_mo "Traded value, shipped by vessel"
. lab var ves_wgt_mo "Traded weight in kg, shipped by vessel"
. lab var UNIT_QY1 "1st quantity unit of measure"
. lab var UNIT_QY2 "2nd quantity unit of measure"
.
.
. * drop low value shipment lines
. drop if comm==9809005000 /* SHIPMENTS VALUED $20,000 AND UNDER, NOT IDENTIF
> IED BY KIND */
(4 observations deleted)
. drop if comm==9880002000 /* CANADA LOW VAL SHIPMTS & SHIPMTS NOT IDEN BY K
> IND */
(12 observations deleted)
. drop if comm==9880004000 /* LOW VALUE ESTIMATE, EXCLUDING CANADA */
(2609 observations deleted)
.
.
. * unit of quantity shall be in kg
. replace qty1_mo = qty2_mo if (UNIT_QY1!="KG" & UNIT_QY2=="KG")
(150739 real changes made)
. replace UNIT_QY1 = UNIT_QY2 if (UNIT_QY1!="KG" & UNIT_QY2=="KG")
(156799 real changes made)
. drop qty2_mo UNIT_QY2
. ren qty1_mo all_qty_mo
. ren UNIT_QY1 unit_qty
. lab var all_qty_mo "Traded quantity"
. lab var unit_qty "Quantity unit of measure"
.
.
. * 6-digit HS product codes
. ren comm hs10
. gen hs6 = int(hs10/10000)
. save `census', replace
file /tmp/St27755.000001 saved
.
. * B.E.C.
. clear
. tempfile bec_codes
. import excel using data/unsd/hs2bec.xls, sheet("HS2007 - BEC correlation") ce
> llrange(E7:F5058) firstrow
. destring HS2007 BEC, replace
HS2007 has all characters numeric; replaced as double
(1 missing value generated)
BEC has all characters numeric; replaced as int
(1 missing value generated)
. replace HS2007 = round(HS2007*100)
(5050 real changes made)
. ren HS2007 hs6
. ren BEC bec
. keep if !missing(hs6,bec)
(1 observation deleted)
. collapse (min) bec, by(hs6)
. save `bec_codes'
file /tmp/St27755.000003 saved
.
. use `census'
. merge m:1 hs6 using `bec_codes', keep(match)
(note: variable hs6 was float, now double to accommodate using data's values)
Result # of obs.
-----------------------------------------
not matched 0
matched 3,864,635 (_merge==3)
-----------------------------------------
. drop _m
. drop if (bec>300 & bec<400) | (bec>30 & bec<40) | bec==7 /* drop fuels and un
> classified */
(40753 observations deleted)
. lab var hs6 "6-digit HS product code"
. lab var bec "BEC product code"
.
. * keep only unique transport mode
. gen ground_val = all_val - air_val - ves_val
(18 missing values generated)
. gen byte air = air_val > 0
. gen byte sea = ves_val > 0
. gen byte ground = ground_val > 0
. gen modes = air + sea + ground
. tab modes
modes | Freq. Percent Cum.
------------+-----------------------------------
1 | 3,593,103 93.96 93.96
2 | 229,656 6.01 99.97
3 | 1,123 0.03 100.00
------------+-----------------------------------
Total | 3,823,882 100.00
. keep if modes==1
(230779 observations deleted)
. drop modes
. gen byte mode = air+2*sea+3*ground
. label var mode "Transport mode (1 Air 2 Sea 3 Ground)"
. label def mode 1 "Air" 2 "Sea" 3 "Ground"
. label val mode mode
. drop air sea ground air_val ves_val ground_val
.
. * importers list
. merge m:1 census_code using data/census/country_list, keep(match)
(note: variable census_code was int, now float to accommodate using data's
values)
Result # of obs.
-----------------------------------------
not matched 0
matched 3,577,101 (_merge==3)
-----------------------------------------
. drop _m
.
.
. * number of shipments per entry
. sum cards, detail
Number of shipments
-------------------------------------------------------------
Percentiles Smallest
1% 1 1
5% 1 1
10% 1 1 Obs 3577101
25% 1 1 Sum of Wgt. 3577101
50% 1 Mean 4.703166
Largest Std. Dev. 26.86612
75% 3 4704
90% 7 4753 Variance 721.7886
95% 14 5259 Skewness 50.94247
99% 59 6064 Kurtosis 5137.765
. * "more than half contain only one shipment"
. * "the average number of shipment is 4.8"
.
.
. ********************************************
. * Table 2: median shipment size - U.S.
. ********************************************
. gen importer = "_r.o.w"
. * selected low per-shipment cost importers
. replace importer = "1 Canada" if census_code==1220
importer was str6 now str8
(362164 real changes made)
. replace importer = "2 Germany" if census_code==4280
importer was str8 now str9
(138369 real changes made)
. replace importer = "3 Israel" if census_code==5081
(39009 real changes made)
. replace importer = "4 Singapore" if census_code==5590
importer was str9 now str11
(91890 real changes made)
. * selected high per-shipment cost importers
. replace importer = "5 Chile" if census_code==3370
(42587 real changes made)
. replace importer = "6 China" if census_code==5700
(131875 real changes made)
. replace importer = "7 Russia" if census_code==4621
(21792 real changes made)
. replace importer = "8 Venezuela" if census_code==3070
(32125 real changes made)
. lab var importer "Selected importers"
. gen ssize = all_val_mo/cards_mo
. tabstat ssize [fweight = cards_mo], statistics(median) by(importer) columns(v
> ariables)
Summary statistics: p50
by categories of: importer (Selected importers)
importer | ssize
------------+----------
1 Canada | 13940.04
2 Germany | 13769.13
3 Israel | 13230
4 Singapore | 13341.33
5 Chile | 13766.56
6 China | 29249.88
7 Russia | 28156.5
8 Venezuela | 24010.5
_r.o.w | 14257.07
------------+----------
Total | 14434.81
-----------------------
. drop ssize
.
.
. * saving labels
. foreach v of var * {
2. local l`v' : variable label `v'
3. if `"`l`v''"' == "" {
4. local l`v' "`v'"
5. }
6. }
.
. * collapse to product-mode-country-month
. collapse (sum) cards all_qty all_val air_* ves_*, by(hs10 bec census_code imp
> orter mode unit_qty month)
.
.
. * collapse to product-mode-country
. collapse (sum) cards all_qty all_val air_* ves_* (count) nmonth=month, by(hs1
> 0 bec census_code importer mode unit_qty)
.
.
. * attach labels
. foreach v of var * {
2. label var `v' "`l`v''"
3. }
. label var nmonth "Number of months with trade"
. save `census', replace
file /tmp/St27755.000001 saved
.
.
. * census country code to 2-digit iso
. merge m:1 census_code using data/census/country_list, keep(match)
Result # of obs.
-----------------------------------------
not matched 0
matched 522,929 (_merge==3)
-----------------------------------------
. drop _m
. lab var census_code "Census country code"
. lab var iso2 "2-digit ISO country code"
.
. * add country variables
. gen exporter = "USA"
. merge m:1 exporter iso2 using data/derived/merged_importer_data
Result # of obs.
-----------------------------------------
not matched 2,591
from master 2,356 (_merge==1)
from using 235 (_merge==2)
matched 520,573 (_merge==3)
-----------------------------------------
. drop _m exporter
.
.
. ************************************************************
. * Table 2: shipments per month, months with shipment - U.S.
. ************************************************************
. gen n = cards/nmonth
(235 missing values generated)
. lab var n "Number of shipments per month"
. tabstat n nmonth, statistics(median) by(importer) columns(variables)
Summary statistics: p50
by categories of: importer (Selected importers)
importer | n nmonth
------------+--------------------
1 Canada | 4.449495 12
2 Germany | 1.333333 4
3 Israel | 1 2
4 Singapore | 1.285714 4
5 Chile | 1 3
6 China | 1.5 4
7 Russia | 1 2
8 Venezuela | 1 3
_r.o.w | 1 2
------------+--------------------
Total | 1 2
---------------------------------
. drop n
.
.
. * only trade in kilogram
. gen weight = all_qty_mo if unit_qty=="KG"
(308296 missing values generated)
. replace weight = air_wgt + ves_wgt if mode!=3
(431481 real changes made)
. label var weight "Weight in kilogram"
. gen byte kg = !missing(weight)
. tab kg
kg | Freq. Percent Cum.
------------+-----------------------------------
0 | 21,986 4.20 4.20
1 | 501,178 95.80 100.00
------------+-----------------------------------
Total | 523,164 100.00
. label var kg "Trade quantity is in kilogram"
. drop air_* ves_* all_qty
.
.
. * LHS vars
. gen lx=ln(all_val)
(235 missing values generated)
. gen ln=ln(cards)
(235 missing values generated)
. gen lv=lx-ln
(235 missing values generated)
. gen lh=ln(nmonth)
(235 missing values generated)
. gen lnh=ln-lh
(235 missing values generated)
. gen lq=ln(weight)-ln
(22206 missing values generated)
. gen lp=lv-lq
(22206 missing values generated)
. drop if weight==0 /* for these lq and lp are not defined */
(220 observations deleted)
.
.
. * label LHS vars
. label var lx "ln total margin per importer-product"
. label var ln "ln number of shipments"
. label var lh "ln number of months with shipment"
. label var lnh "ln average shipment number per month"
. label var lv "ln value ssize"
. label var lq "ln quantity ssize"
. label var lp "ln price"
.
.
. * clustering groups
. gen hs2 = int(hs10/100000000)
(235 missing values generated)
. lab var hs2 "2-digit HS product code"
. egen cty_hs2 = group(census_code hs2)
(235 missing values generated)
.
.
. * product-mode dummies
. egen prodmode = group(hs10 mode)
(235 missing values generated)
.
. saveold data/derived/usa_export_2009_with_gravity, replace
(note: file data/derived/usa_export_2009_with_gravity.dta not found)
file data/derived/usa_export_2009_with_gravity.dta saved
.
end of do-file