1. Introduction to the software

The main data sources for Senegal are:

  • Open Data Plateforme (ODP);
  • Archivage national des données du Sénégal (ANADS);
  • Demographic and Health Surveys (DHS).

This presentation uses Stata version 18.

1.1. Interface windows

[Image: the interface windows]

1.2. Calculator

In [ ]:
display as txt "The sum is S = " as res 1+6
display as res 7-5
display 2*7
display 17/3
display int(17/3)
display mod(17,3)
display 2^3
display exp(1)
display sin(_pi/2)
display comb(10,2)
mata: factorial(3)

1.3. System commands

In [ ]:
version
dir
ls *.png
findfile stata.png
copy stata.png stata_new.png
ls stata*
rm stata_new.png // rm is a synonym for erase
sysdir list
sysdir
pwd
cd "C:\Users\ibtall\Documents\PERSONNAL\COURS"
mkdir mondoc
rmdir mondoc
cd ..
cd ./COURS
which display
help display
search xtable
In [ ]:
ssc new
ssc whatshot, n(3)
ssc hot, n(3)
ssc describe w
ssc describe ereplace
ssc install ereplace
ssc install elabel
ssc install dummies
ssc uninstall dummies
In [ ]:
net
net query
net from C:\Users\ibtall\Documents
net cd C:\Users\ibtall\Documents\PERSONNAL
net link sj
net describe dummies, from(C:\Users\ibtall\ado\plus)
net set ado C:\Users\ibtall\ado\plus
net set other C:\Users\ibtall\Documents
net search dups
net get dups, from(C:\Users\ibtall\Documents)
net install schemepack, replace
In [ ]:
ado, find("label") from(C:\Users\ibtall\ado\plus)
ado dir, find("la") from(C:\Users\ibtall\ado\plus)
ado describe, find("label") from(C:\Users\ibtall\ado\plus)
ado uninstall dups, from(C:\Users\ibtall\ado\plus\d)

3. Working with the dataset

3.1. Importing and exporting data

In [ ]:
webuse systolic, clear
sysuse auto, clear
sysuse dir
import excel MaBase.xlsx, sheet("STORMS") cellrange(B2:O19538) firstrow clear
save mabase, replace
export excel MaBase.xlsx, sheet("STORMS BIS") cell(B2) firstrow(varlabels) replace
isid make
merge 1:1 make using mabase.dta, generate(related) keepusing(foreign rep78 length) keep(match)
append using mabase.dta, generate(linked)

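Comments are written with an asterisk (*) at the start of a line or with a double slash (//) at the end of a line. A triple slash (///) also starts a comment, and in addition continues the command onto the next line.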
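
The three comment markers can be illustrated with a minimal sketch. It assumes the cell is run as a do-file, since // and /// are not recognized when typed interactively in the Command window; the choice of the auto dataset and of the summarized variables is only for illustration.

In [ ]:
* Full-line comment: the asterisk is the first non-blank character on the line
sysuse auto, clear
summarize price // end-of-line comment: everything after the double slash is ignored
summarize price mpg /// the command continues on the next line
    weight

Both // and /// must be preceded by at least one space, otherwise they are read as part of the command.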

3.2. Variable transformations

In [ ]:
notes: American cars
notes make: The make and model of the car
notes
notes replace _dta in 2: Used cars
notes _dta
notes search cars
notes drop _dta in 2
notes list
In [ ]:
rename price cout
rename * v#, addnumber
rename * v#, addnumber(20)
rename v# v#, renumber(0) sort
rename (v1 v2 v3) (Marque cout kilometrage)
rename v?, upper
rename *, lower
* v* no longer matches every variable at this point, so restore the original names with *
rename * (make price mpg rep78 headroom trunk weight length turn displacement gear_ratio foreign)
In [ ]:
sort make
gsort foreign -price 
order turn foreign, after(make)
order foreign, last
In [ ]:
assert inrange(price,0, 100000)
list price mpg rep78 headroom trunk weight length in 1/7
edit
browse

3.3. Common statistics

In [ ]:
describe
codebook price foreign
count if price <= 5000
by foreign, sort: count if price <= 5000
inspect price
summarize weight
summarize weight, detail
bysort foreign: summarize price
statsby, by(foreign) saving(myfile, replace): summarize price weight 
use mabase, clear
preserve // collapse replaces the data in memory; keep a copy for the commands that follow
collapse (count) Nb = make (mean) Prix = price (median) Poids_median = weight, by(rep78 foreign)
restore
mean price weight, over(foreign)
total price weight, over(foreign)
proportion rep78, over(foreign)
ratio ppoids: price/weight, over(foreign)

3.4. Statistical tables

In [ ]:
tab1 foreign rep78
tab2 rep78 foreign
tabulate foreign, summarize(price)
tabulate rep78 foreign, row nofreq
tabulate rep78 foreign, summarize(price) means
tabstat price weight mpg, by(foreign) statistics(mean)
In [ ]:
table (rep78)(foreign), statistic(mean price) statistic(median weight)
table ()(var), statistic(mean price weight) statistic(fvfrequency foreign) ///
 command(regress price length i.rep78) name(mytab)
collect export mytables.xlsx, name(mytab) as(xlsx) sheet(table) cell(B2) modify
dtable, by(rep78) continuous(price length, statistics(mean) test(kwallis)) ///
 factor(foreign, statistics(fvpercent) test(kendall)) nformat(%12.0f) ///
 name(mydtab) export("mytables.xlsx", as(xlsx) sheet(mafeuil) cell(A2) modify)
regress price weight mpg length i.rep78
estimates store mymod
etable, estimates(mymod) export(mytables.xlsx, sheet(etable) modify) name(myreg)
collect dir
collect clear

5. Transformations

5.1. Variable manipulation

In [ ]:
generate cout = rep78 * 12500
generate loi1 = runiform()
generate loi2 = runiform()
compare loi1 loi2
set seed 123456
generate loi1u = runiform()
set seed 123456
generate loi2u = runiform()
compare loi2u loi1u
generate marque = word(make, 1)
bysort rep78 foreign: egen vprice = mean(price)
generate loi = runiform() // loi must exist before it can be replaced
replace loi = rnormal()
bysort rep78 foreign: ereplace vprice = total(price)
generate cpoids = cond(mpg <= 20, 1, 2)
egen prix_cl = cut(price), at(3291, 5000, 10000, 15907) icodes // upper bound is exclusive: 15907 keeps the maximum price (15906)
egen mpg_cl = cut(mpg), group(3)
generate weight_cl = autocode(weight, 4, 1760, 4840)
recode weight_cl (2530 = 1 "Light") (3300 = 2 "Less heavy") (4070 = 3 "Heavy") (else = 4 "Very heavy"), generate(new_weight)
label variable cout "Repair cost"
elabel variable (loi marque) ("Uniform distribution" "Car make")
label define prix_cod 0 "Cheap" 1 "Affordable" 2 "Expensive" 3 "Very expensive"
label values prix_cl prix_cod
label define fcode 1 Domesti 2 Foreign
* decode first: foreigntxt must exist before it can be encoded
decode foreign, generate(foreigntxt) maxlength(7)
encode foreigntxt, generate(foreigncod) label(fcode)
tostring gear_ratio, generate(geartxt) force
destring geartxt, generate(gearnum) ignore("./") force
drop cout loi
keep make price mpg marque prix_cl foreign rep78
keep if !missing(price)
drop in 1/22

5.2. Loops and iteration

In [ ]:
forvalues i = 5 10 to 25 {
    display `i'
}
foreach v in moy var {
    display strlen("`v'") 
}
levelsof foreign, local(lniv)
foreach z of local lniv {
    di "-> factor = `:label (foreign) `z''"
    count if foreign == `z'
}
local i = 0
while `i' <= 5 {
    display `i'
    local ++i
}

6. Graphics

In [ ]:
set scheme gg_tableau
graph bar (percent), over(rep78) blabel(bar, format(%9.2f) color(red))
graph hbar (percent), over(rep78) blabel(bar, format(%9.2f)) scheme(swift_red)
graph bar (percent), over(rep78) over(foreign) asyvars percentages blabel(bar, position(center) format(%9.1f))
graph bar (percent), over(rep78) over(foreign) asyvars percentages stack ///
 blabel(bar, position(center) format(%9.1f)) scheme(black_tableau)
graph box price, over(foreign) 
graph pie, over(rep78) pie(_all, explode(10)) plabel(_all percent, color(blue) format(%4.1f))