Overview
Dataset statistics
Statistic | Value |
---|---|
Number of variables | 16 |
Number of observations | 16,270 |
Missing cells | 45,187 |
Missing cells (%) | 17.4% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 2.0 MiB |
Average record size in memory | 128.0 B |
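The derived figures in this table can be checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that each column stores 8 bytes per row; the 8-byte width is an assumption, not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 16_270
missing_cells = 45_187

# Share of missing cells over the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: every column holds one 8-byte value per row.
bytes_per_value = 8
record_size_bytes = n_variables * bytes_per_value
total_mib = record_size_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Record size: {record_size_bytes} B")
print(f"Total size: {total_mib:.1f} MiB")
```

Under these assumptions the sketch reproduces the reported 17.4%, 128 B, and 2.0 MiB.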
Variable types
Type | Count |
---|---|
Text | 1 |
Categorical | 12 |
Numeric | 2 |
Boolean | 1 |
Alerts
employment_status_of_the_main_occupation is highly overall correlated with main_activity_engaged_in | High correlation |
ethnicity is highly overall correlated with religion | High correlation |
gender is highly overall correlated with relationship_to_the_head_of_household | High correlation |
main_activity_engaged_in is highly overall correlated with employment_status_of_the_main_occupation | High correlation |
relationship_to_the_head_of_household is highly overall correlated with gender | High correlation |
religion is highly overall correlated with ethnicity | High correlation |
ethnicity is highly imbalanced (69.9%) | Imbalance |
current_attendance_in_any_education_instituition is highly imbalanced (56.1%) | Imbalance |
current_attendance_in_any_education_instituition has 418 (2.6%) missing values | Missing |
highest_level_of_education has 775 (4.8%) missing values | Missing |
main_activity_engaged_in has 2133 (13.1%) missing values | Missing |
main_occupation has 9902 (60.9%) missing values | Missing |
daily_wage_owner_or_not has 10090 (62.0%) missing values | Missing |
employment_status_of_the_main_occupation has 9902 (60.9%) missing values | Missing |
member_went_out_for_work_or_not_during_last_week has 11967 (73.6%) missing values | Missing |
no_of_hours_stayed_at_home_during_last_week has 699 (4.3%) zeros | Zeros |
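The per-variable missing counts in the alerts above can be reconciled with the dataset-level "Missing cells" figure. This short sketch uses only numbers copied from the report and assumes that no variable outside the "Missing" alerts contributes missing cells.

```python
# Missing counts copied from the "Missing" alerts.
missing_by_column = {
    "current_attendance_in_any_education_instituition": 418,
    "highest_level_of_education": 775,
    "main_activity_engaged_in": 2133,
    "main_occupation": 9902,
    "daily_wage_owner_or_not": 10090,
    "employment_status_of_the_main_occupation": 9902,
    "member_went_out_for_work_or_not_during_last_week": 11967,
}
n_observations = 16_270

# Total across the flagged columns and per-column share of rows.
total_missing = sum(missing_by_column.values())
missing_share = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

print(total_missing)
for column, share in missing_share.items():
    print(f"{column}: {share}%")
```

The total equals the 45,187 missing cells reported under "Dataset statistics", so the seven flagged variables account for all missing data.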
Reproduction
Field | Value |
---|---|
Analysis started | 2024-12-06 05:54:26.223750 |
Analysis finished | 2024-12-06 05:54:28.327003 |
Duration | 2.1 seconds |
Software version | ydata-profiling v4.11.0 |
Download configuration | config.json |
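The reported duration follows from the two timestamps. This is a small sketch using only the Python standard library.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-12-06 05:54:26.223750")
finished = datetime.fromisoformat("2024-12-06 05:54:28.327003")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```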
Variables
household_ID
Text
Distinct | 4063 |
---|---|
Distinct (%) | 25.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Value | Count | Frequency (%) |
id0349 | 13 | 0.1% |
id3438 | 13 | 0.1% |
id0849 | 12 | 0.1% |
id1781 | 12 | 0.1% |
id2880 | 12 | 0.1% |
id0939 | 12 | 0.1% |
id3013 | 11 | 0.1% |
id0699 | 11 | 0.1% |
id2896 | 11 | 0.1% |
id2341 | 11 | 0.1% |
Other values (4053) | 16152 | 99.3% |
Most occurring characters
Value | Count | Frequency (%) |
I | 16270 | |
D | 16270 | |
0 | 9274 | |
1 | 8926 | |
2 | 8804 | |
3 | 8734 | |
4 | 5068 | 5.2% |
8 | 5009 | 5.1% |
6 | 4859 | 5.0% |
5 | 4827 | 4.9% |
Other values (2) | 9579 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 65080 | |
Uppercase Letter | 32540 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 9274 | |
1 | 8926 | |
2 | 8804 | |
3 | 8734 | |
4 | 5068 | |
8 | 5009 | |
6 | 4859 | |
5 | 4827 | |
7 | 4802 | |
9 | 4777 |
Uppercase Letter
Value | Count | Frequency (%) |
I | 16270 | |
D | 16270 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 65080 | |
Latin | 32540 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 9274 | |
1 | 8926 | |
2 | 8804 | |
3 | 8734 | |
4 | 5068 | |
8 | 5009 | |
6 | 4859 | |
5 | 4827 | |
7 | 4802 | |
9 | 4777 |
Latin
Value | Count | Frequency (%) |
I | 16270 | |
D | 16270 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 97620 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
I | 16270 | |
D | 16270 | |
0 | 9274 | |
1 | 8926 | |
2 | 8804 | |
3 | 8734 | |
4 | 5068 | 5.2% |
8 | 5009 | 5.1% |
6 | 4859 | 5.0% |
5 | 4827 | 4.9% |
Other values (2) | 9579 |
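The "Decimal Number" and "Uppercase Letter" rows above correspond to Unicode general categories. The sketch below tallies them with the standard-library `unicodedata` module. The sample IDs are illustrative, written in upper case as the character tables suggest, and `CATEGORY_NAMES` is a hypothetical lookup covering only the two categories that occur here.

```python
import unicodedata
from collections import Counter

# Illustrative values in the same shape as household_ID ("ID" + 4 digits).
sample_ids = ["ID0349", "ID3438", "ID0849"]

# Hypothetical mapping from Unicode category codes to the report's labels.
CATEGORY_NAMES = {"Lu": "Uppercase Letter", "Nd": "Decimal Number"}

category_counts = Counter()
for value in sample_ids:
    for ch in value:
        code = unicodedata.category(ch)  # e.g. "Lu" for "I", "Nd" for "0"
        category_counts[CATEGORY_NAMES.get(code, code)] += 1

print(category_counts)
```

With two letters and four digits per ID, 16,270 rows give 32,540 uppercase letters and 65,080 decimal digits, which matches the table.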
member_ID
Categorical
Distinct | 13 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Common Values
Value | Count | Frequency (%) |
I1 | 4063 | 25.0% |
I2 | 3877 | 23.8% |
I3 | 3275 | 20.1% |
I4 | 2457 | 15.1% |
I5 | 1443 | 8.9% |
I6 | 671 | 4.1% |
I7 | 264 | 1.6% |
I8 | 120 | 0.7% |
I9 | 55 | 0.3% |
I10 | 26 | 0.2% |
Other values (3) | 19 | 0.1% |
Words
Value | Count | Frequency (%) |
i1 | 4063 | |
i2 | 3877 | |
i3 | 3275 | |
i4 | 2457 | |
i5 | 1443 | 8.9% |
i6 | 671 | 4.1% |
i7 | 264 | 1.6% |
i8 | 120 | 0.7% |
i9 | 55 | 0.3% |
i10 | 26 | 0.2% |
Other values (3) | 19 | 0.1% |
Most occurring characters
Value | Count | Frequency (%) |
I | 16270 | |
1 | 4119 | 12.6% |
2 | 3883 | 11.9% |
3 | 3277 | 10.1% |
4 | 2457 | 7.5% |
5 | 1443 | 4.4% |
6 | 671 | 2.1% |
7 | 264 | 0.8% |
8 | 120 | 0.4% |
9 | 55 | 0.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 16315 | |
Uppercase Letter | 16270 |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 4119 | |
2 | 3883 | |
3 | 3277 | |
4 | 2457 | |
5 | 1443 | 8.8% |
6 | 671 | 4.1% |
7 | 264 | 1.6% |
8 | 120 | 0.7% |
9 | 55 | 0.3% |
0 | 26 | 0.2% |
Uppercase Letter
Value | Count | Frequency (%) |
I | 16270 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 16315 | |
Latin | 16270 |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 4119 | |
2 | 3883 | |
3 | 3277 | |
4 | 2457 | |
5 | 1443 | 8.8% |
6 | 671 | 4.1% |
7 | 264 | 1.6% |
8 | 120 | 0.7% |
9 | 55 | 0.3% |
0 | 26 | 0.2% |
Latin
Value | Count | Frequency (%) |
I | 16270 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 32585 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
I | 16270 | |
1 | 4119 | 12.6% |
2 | 3883 | 11.9% |
3 | 3277 | 10.1% |
4 | 2457 | 7.5% |
5 | 1443 | 4.4% |
6 | 671 | 2.1% |
7 | 264 | 0.8% |
8 | 120 | 0.4% |
9 | 55 | 0.2% |
age
Real number (ℝ)
Distinct | 97 |
---|---|
Distinct (%) | 0.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 38.39287 |
Minimum | 0 |
---|---|
Maximum | 98 |
Zeros | 149 |
Zeros (%) | 0.9% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 127.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 5 |
Q1 | 19 |
median | 38 |
Q3 | 56 |
95-th percentile | 75 |
Maximum | 98 |
Range | 98 |
Interquartile range (IQR) | 37 |
Descriptive statistics
Standard deviation | 22.075172 |
---|---|
Coefficient of variation (CV) | 0.57498103 |
Kurtosis | -0.99725102 |
Mean | 38.39287 |
Median Absolute Deviation (MAD) | 18 |
Skewness | 0.14195376 |
Sum | 624652 |
Variance | 487.31322 |
Monotonicity | Not monotonic |
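Several of the age statistics are linked by simple identities. The sketch below recomputes them from the reported sum, standard deviation, and quantiles; all inputs are copied from the tables above.

```python
# Inputs copied from the age statistics.
n = 16_270
total = 624_652
std = 22.075172
q1, q3 = 19, 56
minimum, maximum = 0, 98

mean = total / n              # arithmetic mean
cv = std / mean               # coefficient of variation
variance = std ** 2           # variance is the squared standard deviation
iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} cv={cv:.8f} variance={variance:.5f}")
print(f"iqr={iqr} range={value_range}")
```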
Common values
Value | Count | Frequency (%) |
19 | 282 | 1.7% |
17 | 278 | 1.7% |
23 | 276 | 1.7% |
15 | 267 | 1.6% |
18 | 266 | 1.6% |
20 | 263 | 1.6% |
16 | 256 | 1.6% |
42 | 252 | 1.5% |
22 | 251 | 1.5% |
45 | 245 | 1.5% |
Other values (87) | 13634 | 83.8% |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 149 | 0.9% |
1 | 131 | 0.8% |
2 | 138 | 0.8% |
3 | 186 | 1.1% |
4 | 171 | 1.1% |
5 | 177 | 1.1% |
6 | 163 | 1.0% |
7 | 164 | 1.0% |
8 | 196 | 1.2% |
9 | 198 | 1.2% |
Maximum 10 values
Value | Count | Frequency (%) |
98 | 1 | < 0.1% |
96 | 1 | < 0.1% |
95 | 3 | < 0.1% |
93 | 8 | < 0.1% |
92 | 5 | < 0.1% |
91 | 7 | < 0.1% |
90 | 17 | 0.1% |
89 | 20 | 0.1% |
88 | 16 | 0.1% |
87 | 14 | 0.1% |
relationship_to_the_head_of_household
Categorical
High correlation 
Distinct | 12 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Length
Max length | 44 |
---|---|
Median length | 12 |
Mean length | 17.506884 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Head of the household |
---|---|
2nd row | Wife/Husband |
3rd row | Son/daughter |
4th row | Son-in-law/Daughter in law |
5th row | Head of the household |
Common Values
Value | Count | Frequency (%) |
Son/daughter | 5654 | 34.8% |
Head of the household | 4012 | 24.7% |
Wife/Husband | 3226 | 19.8% |
Parents of the head of the Household/ spouse | 1198 | 7.4% |
Other relative | 685 | 4.2% |
Grandson/ Granddaughter | 666 | 4.1% |
Son-in-law/Daughter in law | 434 | 2.7% |
Boarder | 237 | 1.5% |
Domestic servant/driver/watcher | 101 | 0.6% |
Other | 52 | 0.3% |
Other values (2) | 5 | < 0.1% |
Words
Value | Count | Frequency (%) |
of | 6408 | |
the | 6408 | |
son/daughter | 5654 | |
head | 5210 | |
household | 5210 | |
wife/husband | 3226 | |
parents | 1198 | 3.1% |
spouse | 1198 | 3.1% |
other | 737 | 1.9% |
relative | 685 | 1.8% |
Other values (13) | 3086 |
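The word counts in the table above can be derived from the Common Values counts. This is a minimal sketch, assuming values are lower-cased, split on whitespace, and stripped of leading and trailing punctuation; the report does not document its tokenizer, so this is an inference from the numbers. The two values hidden under "Other values (2)" are omitted.

```python
import string
from collections import Counter

# Counts copied from the Common Values table for
# relationship_to_the_head_of_household.
value_counts = {
    "Son/daughter": 5654,
    "Head of the household": 4012,
    "Wife/Husband": 3226,
    "Parents of the head of the Household/ spouse": 1198,
    "Other relative": 685,
    "Grandson/ Granddaughter": 666,
    "Son-in-law/Daughter in law": 434,
    "Boarder": 237,
    "Domestic servant/driver/watcher": 101,
    "Other": 52,
}

# Assumed tokenization: lower-case, split on whitespace, trim punctuation.
strip_chars = string.punctuation + string.whitespace
word_counts = Counter()
for value, count in value_counts.items():
    for token in value.lower().split():
        word_counts[token.strip(strip_chars)] += count

for word, count in word_counts.most_common(10):
    print(word, count)
```

For example, "of" appears once in "Head of the household" and twice in "Parents of the head of the Household/ spouse", giving 4012 + 2 × 1198 = 6408.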
Most occurring characters
Value | Count | Frequency (%) |
e | 31961 | 11.2% |
o | 25125 | 8.8% |
h | 24420 | 8.6% |
(space) | 22750 | 8.0% |
d | 21639 | 7.6% |
a | 19715 | 6.9% |
u | 16391 | 5.8% |
t | 16090 | 5.6% |
n | 13486 | 4.7% |
s | 12904 | 4.5% |
Other values (24) | 80356 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 228043 | |
Space Separator | 22750 | 8.0% |
Uppercase Letter | 21794 | 7.7% |
Other Punctuation | 11382 | 4.0% |
Dash Punctuation | 868 | 0.3% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 31961 | |
o | 25125 | |
h | 24420 | |
d | 21639 | |
a | 19715 | |
u | 16391 | |
t | 16090 | |
n | 13486 | 5.9% |
s | 12904 | 5.7% |
r | 11587 | 5.1% |
Other values (11) | 34725 |
Uppercase Letter
Value | Count | Frequency (%) |
H | 8436 | |
S | 6088 | |
W | 3226 | 14.8% |
G | 1332 | 6.1% |
P | 1198 | 5.5% |
O | 737 | 3.4% |
D | 537 | 2.5% |
B | 237 | 1.1% |
R | 3 | < 0.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 11380 | |
' | 2 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 22750 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 868 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 249837 | |
Common | 35000 | 12.3% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 31961 | |
o | 25125 | |
h | 24420 | |
d | 21639 | 8.7% |
a | 19715 | 7.9% |
u | 16391 | 6.6% |
t | 16090 | 6.4% |
n | 13486 | 5.4% |
s | 12904 | 5.2% |
r | 11587 | 4.6% |
Other values (20) | 56519 |
Common
Value | Count | Frequency (%) |
(space) | 22750 | 65.0% |
/ | 11380 | |
- | 868 | 2.5% |
' | 2 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 284837 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 31961 | 11.2% |
o | 25125 | 8.8% |
h | 24420 | 8.6% |
(space) | 22750 | 8.0% |
d | 21639 | 7.6% |
a | 19715 | 6.9% |
u | 16391 | 5.8% |
t | 16090 | 5.6% |
n | 13486 | 4.7% |
s | 12904 | 4.5% |
Other values (24) | 80356 |
gender
Categorical
High correlation 
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Common Values
Value | Count | Frequency (%) |
Female | 8386 | 51.5% |
Male | 7884 | 48.5% |
Words
Value | Count | Frequency (%) |
female | 8386 | |
male | 7884 |
Most occurring characters
Value | Count | Frequency (%) |
e | 24656 | |
a | 16270 | |
l | 16270 | |
F | 8386 | 10.2% |
m | 8386 | 10.2% |
M | 7884 | 9.6% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 65582 | |
Uppercase Letter | 16270 | 19.9% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 24656 | |
a | 16270 | |
l | 16270 | |
m | 8386 | 12.8% |
Uppercase Letter
Value | Count | Frequency (%) |
F | 8386 | |
M | 7884 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 81852 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 24656 | |
a | 16270 | |
l | 16270 | |
F | 8386 | 10.2% |
m | 8386 | 10.2% |
M | 7884 | 9.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 81852 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 24656 | |
a | 16270 | |
l | 16270 | |
F | 8386 | 10.2% |
m | 8386 | 10.2% |
M | 7884 | 9.6% |
ethnicity
Categorical
High correlation  Imbalance 
Distinct | 7 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Common Values
Value | Count | Frequency (%) |
Sinhala | 13560 | 83.3% |
Sri Lankan Moor/Muslim | 1968 | 12.1% |
Sri Lankan Tamil | 572 | 3.5% |
Indian Tamil | 82 | 0.5% |
Malay | 42 | 0.3% |
Burgher | 32 | 0.2% |
Other | 14 | 0.1% |
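The imbalance alert for ethnicity (69.9%) can be reproduced from the counts above. This is a minimal sketch, assuming the score is one minus the Shannon entropy of the class frequencies divided by its maximum possible value, log2 of the number of classes. The report does not state the formula, and `imbalance_score` is a hypothetical helper name. The second list holds the non-missing counts for current_attendance_in_any_education_instituition, taken from that variable's Common Values table.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the entropy in bits over k classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(k)

# Class counts copied from the Common Values tables (missing rows excluded).
ethnicity = [13560, 1968, 572, 82, 42, 32, 14]
attendance = [11435, 2964, 560, 298, 233, 191, 105, 66]

print(f"ethnicity: {100 * imbalance_score(ethnicity):.1f}%")
print(f"attendance: {100 * imbalance_score(attendance):.1f}%")
```

A score of 0% would mean perfectly balanced classes, and values near 100% mean one class dominates. Under the stated assumption, both results agree with the percentages in the alerts.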
Words
Value | Count | Frequency (%) |
sinhala | 13560 | |
sri | 2540 | 11.9% |
lankan | 2540 | 11.9% |
moor/muslim | 1968 | 9.2% |
tamil | 654 | 3.1% |
indian | 82 | 0.4% |
malay | 42 | 0.2% |
burgher | 32 | 0.1% |
other | 14 | 0.1% |
Most occurring characters
Value | Count | Frequency (%) |
a | 33020 | |
n | 18804 | |
i | 18804 | |
l | 16224 | |
S | 16100 | |
h | 13606 | |
(space) | 5162 | 3.5% |
r | 4586 | 3.1% |
M | 3978 | 2.7% |
o | 3936 | 2.6% |
Other values (15) | 14636 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 118326 | |
Uppercase Letter | 23400 | 15.7% |
Space Separator | 5162 | 3.5% |
Other Punctuation | 1968 | 1.3% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 33020 | |
n | 18804 | |
i | 18804 | |
l | 16224 | |
h | 13606 | |
r | 4586 | 3.9% |
o | 3936 | 3.3% |
m | 2622 | 2.2% |
k | 2540 | 2.1% |
u | 2000 | 1.7% |
Other values (6) | 2184 | 1.8% |
Uppercase Letter
Value | Count | Frequency (%) |
S | 16100 | |
M | 3978 | 17.0% |
L | 2540 | 10.9% |
T | 654 | 2.8% |
I | 82 | 0.4% |
B | 32 | 0.1% |
O | 14 | 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 5162 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1968 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 141726 | |
Common | 7130 | 4.8% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 33020 | |
n | 18804 | |
i | 18804 | |
l | 16224 | |
S | 16100 | |
h | 13606 | |
r | 4586 | 3.2% |
M | 3978 | 2.8% |
o | 3936 | 2.8% |
m | 2622 | 1.9% |
Other values (13) | 10046 | 7.1% |
Common
Value | Count | Frequency (%) |
(space) | 5162 | 72.4% |
/ | 1968 | 27.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 148856 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
a | 33020 | |
n | 18804 | |
i | 18804 | |
l | 16224 | |
S | 16100 | |
h | 13606 | |
(space) | 5162 | 3.5% |
r | 4586 | 3.1% |
M | 3978 | 2.7% |
o | 3936 | 2.6% |
Other values (15) | 14636 |
religion
Categorical
High correlation 
Distinct | 7 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Common Values
Value | Count | Frequency (%) |
Buddhism | 10807 | 66.4% |
Roman Catholicism | 2442 | 15.0% |
Islam | 2053 | 12.6% |
Other Christian denominations | 547 | 3.4% |
Hinduism | 407 | 2.5% |
Other | 8 | < 0.1% |
No religion | 6 | < 0.1% |
Words
Value | Count | Frequency (%) |
buddhism | 10807 | |
roman | 2442 | 12.3% |
catholicism | 2442 | 12.3% |
islam | 2053 | 10.4% |
other | 555 | 2.8% |
christian | 547 | 2.8% |
denominations | 547 | 2.8% |
hinduism | 407 | 2.1% |
no | 6 | < 0.1% |
religion | 6 | < 0.1% |
Most occurring characters
Value | Count | Frequency (%) |
d | 22568 | |
i | 18705 | |
m | 18698 | |
s | 16803 | |
h | 14351 | |
u | 11214 | |
B | 10807 | |
a | 8031 | 5.1% |
o | 5990 | 3.8% |
n | 5043 | 3.2% |
Other values (13) | 25250 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 134659 | |
Uppercase Letter | 19259 | 12.2% |
Space Separator | 3542 | 2.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
d | 22568 | |
i | 18705 | |
m | 18698 | |
s | 16803 | |
h | 14351 | |
u | 11214 | |
a | 8031 | 6.0% |
o | 5990 | 4.4% |
n | 5043 | 3.7% |
l | 4501 | 3.3% |
Other values (5) | 8755 | 6.5% |
Uppercase Letter
Value | Count | Frequency (%) |
B | 10807 | |
C | 2989 | 15.5% |
R | 2442 | 12.7% |
I | 2053 | 10.7% |
O | 555 | 2.9% |
H | 407 | 2.1% |
N | 6 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 3542 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 153918 | |
Common | 3542 | 2.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
d | 22568 | |
i | 18705 | |
m | 18698 | |
s | 16803 | |
h | 14351 | |
u | 11214 | |
B | 10807 | |
a | 8031 | 5.2% |
o | 5990 | 3.9% |
n | 5043 | 3.3% |
Other values (12) | 21708 |
Common
Value | Count | Frequency (%) |
(space) | 3542 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 157460 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
d | 22568 | |
i | 18705 | |
m | 18698 | |
s | 16803 | |
h | 14351 | |
u | 11214 | |
B | 10807 | |
a | 8031 | 5.1% |
o | 5990 | 3.8% |
n | 5043 | 3.2% |
Other values (13) | 25250 |
marital_status
Categorical
Distinct | 9 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 127.2 KiB |
Length
Max length | 33 |
---|---|
Median length | 30 |
Mean length | 21.736755 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Currently married (registered) |
---|---|
2nd row | Currently married (registered) |
3rd row | Currently married (registered) |
4th row | Currently married (registered) |
5th row | Currently married (registered) |
Common Values
Value | Count | Frequency (%) |
Currently married (registered) | 7417 | 45.6% |
Never married | 6330 | 38.9% |
Currently married (customary) | 1311 | 8.1% |
Widowed | 893 | 5.5% |
Other | 138 | 0.8% |
Separated (not legally) | 64 | 0.4% |
Not married but lives as a Family | 52 | 0.3% |
Divorced | 44 | 0.3% |
Legally separated | 21 | 0.1% |
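The length statistics for marital_status can be recomputed from the value counts above. This sketch assumes the stored strings are exactly as printed, with no leading or trailing whitespace.

```python
# Counts copied from the Common Values table for marital_status.
value_counts = {
    "Currently married (registered)": 7417,
    "Never married": 6330,
    "Currently married (customary)": 1311,
    "Widowed": 893,
    "Other": 138,
    "Separated (not legally)": 64,
    "Not married but lives as a Family": 52,
    "Divorced": 44,
    "Legally separated": 21,
}

n = sum(value_counts.values())
# Total characters across all rows: string length weighted by its count.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n
max_length = max(len(value) for value in value_counts)
min_length = min(len(value) for value in value_counts)

print(n, total_chars, round(mean_length, 6), max_length, min_length)
```

The character total matches the ASCII block count reported for this variable (353,657), and the mean, maximum, and minimum lengths match the "Length" table.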
Words
Value | Count | Frequency (%) |
married | 15110 | |
currently | 8728 | |
registered | 7417 | |
never | 6330 | |
customary | 1311 | 3.2% |
widowed | 893 | 2.2% |
other | 138 | 0.3% |
not | 116 | 0.3% |
separated | 85 | 0.2% |
legally | 85 | 0.2% |
Other values (6) | 304 | 0.8% |
Most occurring characters
Value | Count | Frequency (%) |
r | 70418 | |
e | 60131 | |
d | 24442 | 6.9% |
(space) | 24247 | 6.9% |
i | 23568 | 6.7% |
t | 17847 | 5.0% |
a | 16832 | 4.8% |
m | 16473 | 4.7% |
y | 10176 | 2.9% |
u | 10091 | 2.9% |
Other values (21) | 79432 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 295504 | |
Space Separator | 24247 | 6.9% |
Uppercase Letter | 16322 | 4.6% |
Open Punctuation | 8792 | 2.5% |
Close Punctuation | 8792 | 2.5% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
r | 70418 | |
e | 60131 | |
d | 24442 | 8.3% |
i | 23568 | 8.0% |
t | 17847 | 6.0% |
a | 16832 | 5.7% |
m | 16473 | 5.6% |
y | 10176 | 3.4% |
u | 10091 | 3.4% |
l | 9066 | 3.1% |
Other values (10) | 36460 |
Uppercase Letter
Value | Count | Frequency (%) |
C | 8728 | |
N | 6382 | |
W | 893 | 5.5% |
O | 138 | 0.8% |
S | 64 | 0.4% |
F | 52 | 0.3% |
D | 44 | 0.3% |
L | 21 | 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 24247 |
Open Punctuation
Value | Count | Frequency (%) |
( | 8792 |
Close Punctuation
Value | Count | Frequency (%) |
) | 8792 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 311826 | |
Common | 41831 | 11.8% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
r | 70418 | |
e | 60131 | |
d | 24442 | 7.8% |
i | 23568 | 7.6% |
t | 17847 | 5.7% |
a | 16832 | 5.4% |
m | 16473 | 5.3% |
y | 10176 | 3.3% |
u | 10091 | 3.2% |
l | 9066 | 2.9% |
Other values (18) | 52782 |
Common
Value | Count | Frequency (%) |
(space) | 24247 | 58.0% |
( | 8792 | 21.0% |
) | 8792 | 21.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 353657 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
r | 70418 | |
e | 60131 | |
d | 24442 | 6.9% |
(space) | 24247 | 6.9% |
i | 23568 | 6.7% |
t | 17847 | 5.0% |
a | 16832 | 4.8% |
m | 16473 | 4.7% |
y | 10176 | 2.9% |
u | 10091 | 2.9% |
Other values (21) | 79432 |
current_attendance_in_any_education_instituition
Categorical
Imbalance  Missing 
Distinct | 8 |
---|---|
Distinct (%) | 0.1% |
Missing | 418 |
Missing (%) | 2.6% |
Memory size | 127.2 KiB |
Length
Max length | 34 |
---|---|
Median length | 15 |
Mean length | 13.564219 |
Min length | 6 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Does not attend |
---|---|
2nd row | Does not attend |
3rd row | Does not attend |
4th row | Does not attend |
5th row | Does not attend |
Common Values
Value | Count | Frequency (%) |
Does not attend | 11435 | 70.3% |
School | 2964 | 18.2% |
University | 560 | 3.4% |
Preschool | 298 | 1.8% |
Other educational institution | 233 | 1.4% |
Vocational/Technical Institution | 191 | 1.2% |
Pending results G.C.E. (O.L / A.L) | 105 | 0.6% |
Still a toddler | 66 | 0.4% |
(Missing) | 418 | 2.6% |
Words
Value | Count | Frequency (%) |
does | 11435 | |
not | 11435 | |
attend | 11435 | |
school | 2964 | 7.4% |
university | 560 | 1.4% |
institution | 424 | 1.1% |
preschool | 298 | 0.7% |
other | 233 | 0.6% |
educational | 233 | 0.6% |
vocational/technical | 191 | 0.5% |
Other values (9) | 828 | 2.1% |
Most occurring characters
Value | Count | Frequency (%) |
t | 37031 | |
o | 30499 | |
n | 25103 | |
e | 24661 | |
(space) | 24184 | 11.2% |
s | 12927 | 6.0% |
a | 12540 | 5.8% |
d | 11905 | 5.5% |
D | 11435 | 5.3% |
l | 4180 | 1.9% |
Other values (24) | 20555 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 172836 | |
Space Separator | 24184 | 11.2% |
Uppercase Letter | 16969 | 7.9% |
Other Punctuation | 821 | 0.4% |
Open Punctuation | 105 | < 0.1% |
Close Punctuation | 105 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
t | 37031 | |
o | 30499 | |
n | 25103 | |
e | 24661 | |
s | 12927 | 7.5% |
a | 12540 | 7.3% |
d | 11905 | 6.9% |
l | 4180 | 2.4% |
c | 4068 | 2.4% |
h | 3686 | 2.1% |
Other values (6) | 6236 | 3.6% |
Uppercase Letter
Value | Count | Frequency (%) |
D | 11435 | |
S | 3030 | 17.9% |
U | 560 | 3.3% |
P | 403 | 2.4% |
O | 338 | 2.0% |
L | 210 | 1.2% |
V | 191 | 1.1% |
T | 191 | 1.1% |
I | 191 | 1.1% |
C | 105 | 0.6% |
Other values (3) | 315 | 1.9% |
Other Punctuation
Value | Count | Frequency (%) |
. | 525 | |
/ | 296 |
Space Separator
Value | Count | Frequency (%) |
(space) | 24184 |
Open Punctuation
Value | Count | Frequency (%) |
( | 105 |
Close Punctuation
Value | Count | Frequency (%) |
) | 105 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 189805 | |
Common | 25215 | 11.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
t | 37031 | |
o | 30499 | |
n | 25103 | |
e | 24661 | |
s | 12927 | 6.8% |
a | 12540 | 6.6% |
d | 11905 | 6.3% |
D | 11435 | 6.0% |
l | 4180 | 2.2% |
c | 4068 | 2.1% |
Other values (19) | 15456 |
Common
Value | Count | Frequency (%) |
(space) | 24184 | 95.9% |
. | 525 | 2.1% |
/ | 296 | 1.2% |
( | 105 | 0.4% |
) | 105 | 0.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 215020 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
t | 37031 | |
o | 30499 | |
n | 25103 | |
e | 24661 | |
(space) | 24184 | 11.2% |
s | 12927 | 6.0% |
a | 12540 | 5.8% |
d | 11905 | 5.5% |
D | 11435 | 5.3% |
l | 4180 | 1.9% |
Other values (24) | 20555 |
highest_level_of_education
Categorical
Missing 
Distinct | 20 |
---|---|
Distinct (%) | 0.1% |
Missing | 775 |
Missing (%) | 4.8% |
Memory size | 127.2 KiB |
Length
Max length | 104 |
---|---|
Median length | 35 |
Mean length | 20.53314 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Passed G.C.E.(A/L) or equivalent |
---|---|
2nd row | Passed G.C.E.(A/L) or equivalent |
3rd row | Passed post Graduate Degree / Diploma |
4th row | Passed post Graduate Degree / Diploma |
5th row | Passed Grade 6 |
Common Values
Value | Count | Frequency (%) |
Passed G.C.E.(A/L) or equivalent | 2989 | 18.4% |
Passed G.C.E.(O/L) | 2697 | 16.6% |
Passed Grade 10 | 2484 | 15.3% |
Passed Degree / Diploma | 1333 | 8.2% |
Passed Grade 12 | 1219 | 7.5% |
Passed Grade 8 | 772 | 4.7% |
Passed Grade 9 | 638 | 3.9% |
Passed Grade 5 | 567 | 3.5% |
Passed Grade 7 | 509 | 3.1% |
Passed Grade 6 | 407 | 2.5% |
Other values (10) | 1880 | 11.6% |
(Missing) | 775 | 4.8% |
Words
Value | Count | Frequency (%) |
passed | 15010 | |
grade | 7737 | |
g.c.e.(a/l | 2989 | 6.0% |
or | 2989 | 6.0% |
equivalent | 2989 | 6.0% |
g.c.e.(o/l | 2697 | 5.4% |
10 | 2484 | 5.0% |
(empty) | 1610 | 3.2% |
degree | 1568 | 3.1% |
diploma | 1568 | 3.1% |
Other values (28) | 8170 |
Most occurring characters
Value | Count | Frequency (%) |
e | 35229 | 11.1% |
(space) | 34316 | 10.8% |
s | 30417 | 9.6% |
a | 29400 | 9.2% |
d | 23453 | 7.4% |
. | 17058 | 5.4% |
P | 15094 | 4.7% |
G | 14306 | 4.5% |
r | 13261 | 4.2% |
/ | 7601 | 2.4% |
Other values (37) | 98026 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 177616 | |
Uppercase Letter | 57786 | 18.2% |
Space Separator | 34316 | 10.8% |
Other Punctuation | 24659 | 7.8% |
Decimal Number | 11440 | 3.6% |
Close Punctuation | 6172 | 1.9% |
Open Punctuation | 6172 | 1.9% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 35229 | |
s | 30417 | |
a | 29400 | |
d | 23453 | |
r | 13261 | 7.5% |
i | 6643 | 3.7% |
o | 5806 | 3.3% |
l | 5709 | 3.2% |
n | 5136 | 2.9% |
t | 4477 | 2.5% |
Other values (11) | 18085 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 15094 | |
G | 14306 | |
E | 6052 | |
C | 5686 | 9.8% |
L | 5686 | 9.8% |
A | 3313 | 5.7% |
D | 3220 | 5.6% |
O | 2697 | 4.7% |
S | 868 | 1.5% |
Q | 648 | 1.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 4007 | |
0 | 2484 | |
2 | 1452 | 12.7% |
8 | 772 | 6.7% |
9 | 638 | 5.6% |
5 | 567 | 5.0% |
7 | 509 | 4.4% |
6 | 407 | 3.6% |
4 | 327 | 2.9% |
3 | 277 | 2.4% |
Other Punctuation
Value | Count | Frequency (%) |
. | 17058 | |
/ | 7601 |
Space Separator
Value | Count | Frequency (%) |
(space) | 34316 |
Close Punctuation
Value | Count | Frequency (%) |
) | 6172 |
Open Punctuation
Value | Count | Frequency (%) |
( | 6172 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 235402 | |
Common | 82759 | 26.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 35229 | |
s | 30417 | |
a | 29400 | |
d | 23453 | |
P | 15094 | 6.4% |
G | 14306 | 6.1% |
r | 13261 | 5.6% |
i | 6643 | 2.8% |
E | 6052 | 2.6% |
o | 5806 | 2.5% |
Other values (22) | 55741 |
Common
Value | Count | Frequency (%) |
(space) | 34316 | 41.5% |
. | 17058 | |
/ | 7601 | 9.2% |
) | 6172 | 7.5% |
( | 6172 | 7.5% |
1 | 4007 | 4.8% |
0 | 2484 | 3.0% |
2 | 1452 | 1.8% |
8 | 772 | 0.9% |
9 | 638 | 0.8% |
Other values (5) | 2087 | 2.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 318161 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 35229 | 11.1% |
(space) | 34316 | 10.8% |
s | 30417 | 9.6% |
a | 29400 | 9.2% |
d | 23453 | 7.4% |
. | 17058 | 5.4% |
P | 15094 | 4.7% |
G | 14306 | 4.5% |
r | 13261 | 4.2% |
/ | 7601 | 2.4% |
Other values (37) | 98026 |
main_activity_engaged_in
Categorical
High correlation  Missing 
Distinct | 10 |
---|---|
Distinct (%) | 0.1% |
Missing | 2133 |
Missing (%) | 13.1% |
Memory size | 127.2 KiB |
Length
Max length | 221 |
---|---|
Median length | 169 |
Mean length | 54.345052 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) |
---|---|
2nd row | Household activities |
3rd row | Engaged in economic activity/ currently employed/ engaged in own business |
4th row | Engaged in economic activity/ currently employed/ engaged in own business |
5th row | Retired - Obtaining government/semi-government pension payment and currently not engaged in economic activity (not employed elsewhere or not engaged in any own business) |
Common Values
Value | Count | Frequency (%) |
Engaged in economic activity/ currently employed/ engaged in own business | 5473 | 33.6% |
Household activities | 3465 | 21.3% |
Student | 2513 | 15.4% |
Too old / Disable/ unable to work | 859 | 5.3% |
Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) | 707 | 4.3% |
Retired - Obtaining government/semi-government pension payment and currently not engaged in economic activity (not employed elsewhere or not engaged in any own business) | 364 | 2.2% |
Retired from the private/semi-government sector and does not obtain any pension payment | 266 | 1.6% |
Seeking for and available to work | 215 | 1.3% |
Received other pension payments | 159 | 1.0% |
Other | 116 | 0.7% |
(Missing) | 2133 | 13.1% |
Words
Value | Count | Frequency (%) |
engaged | 13088 | 12.4% |
in | 13088 | 12.4% |
economic | 6544 | 6.2% |
activity | 6544 | 6.2% |
currently | 6544 | 6.2% |
employed | 6544 | 6.2% |
own | 6544 | 6.2% |
business | 6544 | 6.2% |
household | 3465 | 3.3% |
activities | 3465 | 3.3% |
Other values (36) | 33339 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 91572 | 11.9% |
e | 90925 | 11.8% |
n | 74726 | 9.7% |
i | 61486 | 8.0% |
o | 47723 | 6.2% |
t | 44499 | 5.8% |
s | 34844 | 4.5% |
a | 32862 | 4.3% |
c | 31480 | 4.1% |
g | 30577 | 4.0% |
Other values (24) | 227582 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 642086 | |
Space Separator | 91572 | 11.9% |
Other Punctuation | 15415 | 2.0% |
Uppercase Letter | 15360 | 2.0% |
Dash Punctuation | 1701 | 0.2% |
Close Punctuation | 1071 | 0.1% |
Open Punctuation | 1071 | 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 90925 | |
n | 74726 | |
i | 61486 | 9.6% |
o | 47723 | 7.4% |
t | 44499 | 6.9% |
s | 34844 | 5.4% |
a | 32862 | 5.1% |
c | 31480 | 4.9% |
g | 30577 | 4.8% |
d | 30490 | 4.7% |
Other values (12) | 162474 |
Uppercase Letter
Value | Count | Frequency (%) |
E | 5473 | |
H | 3465 | |
S | 2728 | |
R | 1496 | 9.7% |
T | 859 | 5.6% |
D | 859 | 5.6% |
O | 480 | 3.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 91572 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 15415 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1701 |
Close Punctuation
Value | Count | Frequency (%) |
) | 1071 |
Open Punctuation
Value | Count | Frequency (%) |
( | 1071 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 657446 | |
Common | 110830 | 14.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 90925 | |
n | 74726 | |
i | 61486 | 9.4% |
o | 47723 | 7.3% |
t | 44499 | 6.8% |
s | 34844 | 5.3% |
a | 32862 | 5.0% |
c | 31480 | 4.8% |
g | 30577 | 4.7% |
d | 30490 | 4.6% |
Other values (19) | 177834 |
Common
Value | Count | Frequency (%) |
(space) | 91572 | 82.6% |
/ | 15415 | 13.9% |
- | 1701 | 1.5% |
) | 1071 | 1.0% |
( | 1071 | 1.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 768276 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 91572 | 11.9% |
e | 90925 | 11.8% |
n | 74726 | 9.7% |
i | 61486 | 8.0% |
o | 47723 | 6.2% |
t | 44499 | 5.8% |
s | 34844 | 4.5% |
a | 32862 | 4.3% |
c | 31480 | 4.1% |
g | 30577 | 4.0% |
Other values (24) | 227582 |
main_occupation
Categorical
Missing 
Distinct | 11 |
---|---|
Distinct (%) | 0.2% |
Missing | 9902 |
Missing (%) | 60.9% |
Memory size | 127.2 KiB |
Service worker and shop and market sales worker | |
---|---|
Professional | |
Elementary occupation | |
Clerk | |
Technician and associate professional | |
Other values (6) |
Length
Max length | 47 |
---|---|
Median length | 39 |
Mean length | 29.174466 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Related to forces |
---|---|
2nd row | Professional |
3rd row | Professional |
4th row | Craft and related worker |
5th row | Clerk |
Common Values
Value | Count | Frequency (%) |
Service worker and shop and market sales worker | 1777 | 10.9% |
Professional | 1272 | 7.8% |
Elementary occupation | 605 | 3.7% |
Clerk | 527 | 3.2% |
Technician and associate professional | 501 | 3.1% |
Legislator, senior official, and manager | 469 | 2.9% |
Skilled agricultural and fishery worker | 333 | 2.0% |
No occupation | 296 | 1.8% |
Craft and related worker | 271 | 1.7% |
Plant and machine operator and assembler | 245 | 1.5% |
(Missing) | 9902 |
Words
Value | Count | Frequency (%) |
and | 5618 | |
worker | 4158 | |
service | 1777 | 6.7% |
shop | 1777 | 6.7% |
market | 1777 | 6.7% |
sales | 1777 | 6.7% |
professional | 1773 | 6.7% |
occupation | 901 | 3.4% |
elementary | 605 | 2.3% |
clerk | 527 | 2.0% |
Other values (18) | 5911 |
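The word counts above can be cross-checked against the Common Values table. The following is a minimal sketch, assuming words are obtained by lowercasing each label and keeping alphabetic runs (this tokenization is an assumption that happens to reproduce the reported counts, not a statement of how the profiler is implemented). Only the ten listed categories are used, so words that also occur in the unlisted eleventh category will come out lower than in the report.

```python
import re
from collections import Counter

# Counts copied from the Common Values table for main_occupation (top 10 only).
common_values = {
    "Service worker and shop and market sales worker": 1777,
    "Professional": 1272,
    "Elementary occupation": 605,
    "Clerk": 527,
    "Technician and associate professional": 501,
    "Legislator, senior official, and manager": 469,
    "Skilled agricultural and fishery worker": 333,
    "No occupation": 296,
    "Craft and related worker": 271,
    "Plant and machine operator and assembler": 245,
}

# Assumed tokenization: lowercase, then split into alphabetic runs.
word_counts = Counter()
for label, count in common_values.items():
    for word in re.findall(r"[a-z]+", label.lower()):
        word_counts[word] += count

for word in ("and", "worker", "professional", "occupation"):
    print(word, word_counts[word])
```

A word that appears twice in a label (for example "and" in the first category) is counted twice per row, which is why "and" exceeds the count of any single category.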
Most occurring characters
Value | Count | Frequency (%) |
(space) | 20233 | |
e | 19589 | |
r | 18530 | 10.0% |
a | 18090 | 9.7% |
o | 14121 | 7.6% |
s | 11712 | 6.3% |
n | 11327 | 6.1% |
i | 9074 | 4.9% |
l | 7785 | 4.2% |
k | 6795 | 3.7% |
Other values (22) | 48527 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 158244 | |
Space Separator | 20233 | 10.9% |
Uppercase Letter | 6368 | 3.4% |
Other Punctuation | 938 | 0.5% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 19589 | |
r | 18530 | |
a | 18090 | |
o | 14121 | |
s | 11712 | 7.4% |
n | 11327 | 7.2% |
i | 9074 | 5.7% |
l | 7785 | 4.9% |
k | 6795 | 4.3% |
d | 6294 | 4.0% |
Other values (12) | 34927 |
Uppercase Letter
Value | Count | Frequency (%) |
S | 2110 | |
P | 1517 | |
C | 798 | 12.5% |
E | 605 | 9.5% |
T | 501 | 7.9% |
L | 469 | 7.4% |
N | 296 | 4.6% |
R | 72 | 1.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 20233 |
Other Punctuation
Value | Count | Frequency (%) |
, | 938 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 164612 | |
Common | 21171 | 11.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 19589 | |
r | 18530 | |
a | 18090 | |
o | 14121 | 8.6% |
s | 11712 | 7.1% |
n | 11327 | 6.9% |
i | 9074 | 5.5% |
l | 7785 | 4.7% |
k | 6795 | 4.1% |
d | 6294 | 3.8% |
Other values (20) | 41295 |
Common
Value | Count | Frequency (%) |
(space) | 20233 | |
, | 938 | 4.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 185783 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 20233 | |
e | 19589 | |
r | 18530 | 10.0% |
a | 18090 | 9.7% |
o | 14121 | 7.6% |
s | 11712 | 6.3% |
n | 11327 | 6.1% |
i | 9074 | 4.9% |
l | 7785 | 4.2% |
k | 6795 | 3.7% |
Other values (22) | 48527 |
daily_wage_owner_or_not
Boolean
Missing 
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 10090 |
Missing (%) | 62.0% |
Memory size | 31.9 KiB |
False | |
---|---|
True | |
(Missing) |
Value | Count | Frequency (%) |
False | 4064 | |
True | 2116 | 13.0% |
(Missing) | 10090 |
employment_status_of_the_main_occupation
Categorical
High correlation  Missing 
Distinct | 6 |
---|---|
Distinct (%) | 0.1% |
Missing | 9902 |
Missing (%) | 60.9% |
Memory size | 127.2 KiB |
Private sector employee | |
---|---|
Own account worker | |
Government employee | |
Employer | |
Semi government employee | 159 |
Length
Max length | 26 |
---|---|
Median length | 23 |
Mean length | 20.7288 |
Min length | 8 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Government employee |
---|---|
2nd row | Government employee |
3rd row | Government employee |
4th row | Private sector employee |
5th row | Government employee |
Common Values
Value | Count | Frequency (%) |
Private sector employee | 3698 | 22.7% |
Own account worker | 1084 | 6.7% |
Government employee | 859 | 5.3% |
Employer | 415 | 2.6% |
Semi government employee | 159 | 1.0% |
Contributing family worker | 153 | 0.9% |
(Missing) | 9902 |
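The percentages in this variable's tables use two different denominators, which is easy to miss. The sketch below recomputes them from the counts shown above; it assumes, based only on the fact that the numbers match, that Frequency (%) and Missing (%) are taken over all 16,270 rows while Distinct (%) is taken over the non-missing rows. The variable names are illustrative.

```python
# Figures copied from the report for employment_status_of_the_main_occupation.
n_rows = 16270
n_missing = 9902
n_distinct = 6
private_sector = 3698

n_present = n_rows - n_missing  # rows with a recorded value

missing_pct = round(100 * n_missing / n_rows, 1)
frequency_pct = round(100 * private_sector / n_rows, 1)
# Assumption: distinct percentage is relative to non-missing rows.
distinct_pct = round(100 * n_distinct / n_present, 1)

print(n_present, missing_pct, frequency_pct, distinct_pct)
```

Dividing the distinct count by all rows instead would give a value below 0.05%, which would not round to the 0.1% shown in the report.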
Words
Value | Count | Frequency (%) |
employee | 4716 | |
private | 3698 | |
sector | 3698 | |
worker | 1237 | 7.1% |
own | 1084 | 6.2% |
account | 1084 | 6.2% |
government | 1018 | 5.8% |
employer | 415 | 2.4% |
semi | 159 | 0.9% |
contributing | 153 | 0.9% |
Most occurring characters
Value | Count | Frequency (%) |
e | 25391 | |
o | 12321 | 9.3% |
r | 11456 | 8.7% |
(space) | 11047 | 8.4% |
t | 9804 | 7.4% |
m | 6461 | 4.9% |
c | 5866 | 4.4% |
l | 5284 | 4.0% |
y | 5284 | 4.0% |
p | 5131 | 3.9% |
Other values (17) | 33956 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 114586 | |
Space Separator | 11047 | 8.4% |
Uppercase Letter | 6368 | 4.8% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 25391 | |
o | 12321 | |
r | 11456 | |
t | 9804 | 8.6% |
m | 6461 | 5.6% |
c | 5866 | 5.1% |
l | 5284 | 4.6% |
y | 5284 | 4.6% |
p | 5131 | 4.5% |
a | 4935 | 4.3% |
Other values (10) | 22653 |
Uppercase Letter
Value | Count | Frequency (%) |
P | 3698 | |
O | 1084 | 17.0% |
G | 859 | 13.5% |
E | 415 | 6.5% |
S | 159 | 2.5% |
C | 153 | 2.4% |
Space Separator
Value | Count | Frequency (%) |
(space) | 11047 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 120954 | |
Common | 11047 | 8.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 25391 | |
o | 12321 | |
r | 11456 | 9.5% |
t | 9804 | 8.1% |
m | 6461 | 5.3% |
c | 5866 | 4.8% |
l | 5284 | 4.4% |
y | 5284 | 4.4% |
p | 5131 | 4.2% |
a | 4935 | 4.1% |
Other values (16) | 29021 |
Common
Value | Count | Frequency (%) |
(space) | 11047 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 132001 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 25391 | |
o | 12321 | 9.3% |
r | 11456 | 8.7% |
(space) | 11047 | 8.4% |
t | 9804 | 7.4% |
m | 6461 | 4.9% |
c | 5866 | 4.4% |
l | 5284 | 4.0% |
y | 5284 | 4.0% |
p | 5131 | 3.9% |
Other values (17) | 33956 |
no_of_hours_stayed_at_home_during_last_week
Real number (ℝ)
Zeros 
Distinct | 326 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 124.84774 |
Minimum | 0 |
---|---|
Maximum | 168 |
Zeros | 699 |
Zeros (%) | 4.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 127.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 10 |
Q1 | 96 |
median | 140 |
Q3 | 168 |
95-th percentile | 168 |
Maximum | 168 |
Range | 168 |
Interquartile range (IQR) | 72 |
Descriptive statistics
Standard deviation | 47.768165 |
---|---|
Coefficient of variation (CV) | 0.38261138 |
Kurtosis | 0.29131003 |
Mean | 124.84774 |
Median Absolute Deviation (MAD) | 28 |
Skewness | -1.0383694 |
Sum | 2031272.7 |
Variance | 2281.7976 |
Monotonicity | Not monotonic |
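Several of the statistics above are derived from others, so they can be sanity-checked with simple arithmetic. The following sketch recomputes the range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. The inputs are the rounded values printed in the report, so comparisons use a small tolerance.

```python
import math

# Values copied from the report for no_of_hours_stayed_at_home_during_last_week.
n_rows = 16270
mean = 124.84774
std = 47.768165
minimum, maximum = 0, 168
q1, q3 = 96, 168

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n_rows             # Sum (no missing values for this variable)

print(value_range, iqr, round(cv, 4), round(variance, 2), round(total, 1))
assert math.isclose(cv, 0.38261138, rel_tol=1e-6)
```

Because Q3 equals the maximum (168 hours, a full week), at least a quarter of respondents reported staying home for the entire week, which is consistent with the negative skewness.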
Common Values
Value | Count | Frequency (%) |
168 | 5380 | |
84 | 761 | 4.7% |
0 | 699 | 4.3% |
160 | 449 | 2.8% |
150 | 439 | 2.7% |
120 | 422 | 2.6% |
140 | 360 | 2.2% |
100 | 333 | 2.0% |
108 | 301 | 1.9% |
96 | 282 | 1.7% |
Other values (316) | 6844 |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 699 | |
0.142 | 1 | < 0.1% |
0.147 | 1 | < 0.1% |
0.159 | 1 | < 0.1% |
0.168 | 1 | < 0.1% |
0.25 | 1 | < 0.1% |
0.3 | 1 | < 0.1% |
1 | 13 | 0.1% |
2 | 13 | 0.1% |
2.3 | 2 | < 0.1% |
Maximum 10 values
Value | Count | Frequency (%) |
168 | 5380 | |
167.5 | 1 | < 0.1% |
167.3 | 1 | < 0.1% |
167.25 | 1 | < 0.1% |
167 | 36 | 0.2% |
166.5 | 1 | < 0.1% |
166 | 87 | 0.5% |
165.9 | 1 | < 0.1% |
165.75 | 1 | < 0.1% |
165.7 | 1 | < 0.1% |
member_went_out_for_work_or_not_during_last_week
Categorical
Missing 
Distinct | 3 |
---|---|
Distinct (%) | 0.1% |
Missing | 11967 |
Missing (%) | 73.6% |
Memory size | 127.2 KiB |
Yes, went daily during working days | |
---|---|
No, worked from home | |
Yes, went on most of the days |
Length
Max length | 35 |
---|---|
Median length | 35 |
Mean length | 31.061585 |
Min length | 20 |
Unique
Unique | 0 ? |
---|---|
Unique (%) | 0.0% |
Sample
1st row | No, worked from home |
---|---|
2nd row | Yes, went daily during working days |
3rd row | Yes, went daily during working days |
4th row | Yes, went on most of the days |
5th row | No, worked from home |
Common Values
Value | Count | Frequency (%) |
Yes, went daily during working days | 2764 | 17.0% |
No, worked from home | 857 | 5.3% |
Yes, went on most of the days | 682 | 4.2% |
(Missing) | 11967 |
Words
Value | Count | Frequency (%) |
yes | 3446 | |
went | 3446 | |
days | 3446 | |
daily | 2764 | |
during | 2764 | |
working | 2764 | |
no | 857 | 3.5% |
worked | 857 | 3.5% |
from | 857 | 3.5% |
home | 857 | 3.5% |
Other values (4) | 2728 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 20483 | |
d | 9831 | 7.4% |
n | 9656 | 7.2% |
e | 9288 | 6.9% |
i | 8292 | 6.2% |
o | 8238 | 6.2% |
s | 7574 | 5.7% |
r | 7242 | 5.4% |
w | 7067 | 5.3% |
a | 6210 | 4.6% |
Other values (12) | 39777 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 104569 | |
Space Separator | 20483 | 15.3% |
Other Punctuation | 4303 | 3.2% |
Uppercase Letter | 4303 | 3.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
d | 9831 | 9.4% |
n | 9656 | 9.2% |
e | 9288 | 8.9% |
i | 8292 | 7.9% |
o | 8238 | 7.9% |
s | 7574 | 7.2% |
r | 7242 | 6.9% |
w | 7067 | 6.8% |
a | 6210 | 5.9% |
y | 6210 | 5.9% |
Other values (8) | 24961 |
Uppercase Letter
Value | Count | Frequency (%) |
Y | 3446 | |
N | 857 | 19.9% |
Space Separator
Value | Count | Frequency (%) |
(space) | 20483 |
Other Punctuation
Value | Count | Frequency (%) |
, | 4303 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 108872 | |
Common | 24786 | 18.5% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
d | 9831 | 9.0% |
n | 9656 | 8.9% |
e | 9288 | 8.5% |
i | 8292 | 7.6% |
o | 8238 | 7.6% |
s | 7574 | 7.0% |
r | 7242 | 6.7% |
w | 7067 | 6.5% |
a | 6210 | 5.7% |
y | 6210 | 5.7% |
Other values (10) | 29264 |
Common
Value | Count | Frequency (%) |
(space) | 20483 | |
, | 4303 | 17.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 133658 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 20483 | |
d | 9831 | 7.4% |
n | 9656 | 7.2% |
e | 9288 | 6.9% |
i | 8292 | 6.2% |
o | 8238 | 6.2% |
s | 7574 | 5.7% |
r | 7242 | 5.4% |
w | 7067 | 5.3% |
a | 6210 | 4.6% |
Other values (12) | 39777 |
Interactions
Correlations
Variable | age | current_attendance_in_any_education_instituition | daily_wage_owner_or_not | employment_status_of_the_main_occupation | ethnicity | gender | highest_level_of_education | main_activity_engaged_in | main_occupation | marital_status | member_ID | member_went_out_for_work_or_not_during_last_week | no_of_hours_stayed_at_home_during_last_week | relationship_to_the_head_of_household | religion |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
age | 1.000 | 0.398 | 0.125 | 0.156 | 0.051 | 0.059 | 0.302 | 0.357 | 0.087 | 0.319 | 0.253 | 0.159 | 0.142 | 0.346 | 0.045 |
current_attendance_in_any_education_instituition | 0.398 | 1.000 | 0.077 | 0.049 | 0.033 | 0.037 | 0.257 | 0.327 | 0.059 | 0.276 | 0.222 | 0.038 | 0.096 | 0.280 | 0.030 |
daily_wage_owner_or_not | 0.125 | 0.077 | 1.000 | 0.428 | 0.107 | 0.135 | 0.331 | 0.123 | 0.285 | 0.087 | 0.107 | 0.118 | 0.125 | 0.113 | 0.112 |
employment_status_of_the_main_occupation | 0.156 | 0.049 | 0.428 | 1.000 | 0.060 | 0.147 | 0.156 | 0.527 | 0.339 | 0.113 | 0.122 | 0.217 | 0.112 | 0.145 | 0.075 |
ethnicity | 0.051 | 0.033 | 0.107 | 0.060 | 1.000 | 0.000 | 0.064 | 0.051 | 0.047 | 0.036 | 0.055 | 0.000 | 0.020 | 0.063 | 0.512 |
gender | 0.059 | 0.037 | 0.135 | 0.147 | 0.000 | 1.000 | 0.042 | 0.473 | 0.209 | 0.162 | 0.479 | 0.124 | 0.283 | 0.526 | 0.000 |
highest_level_of_education | 0.302 | 0.257 | 0.331 | 0.156 | 0.064 | 0.042 | 1.000 | 0.143 | 0.210 | 0.115 | 0.097 | 0.106 | 0.066 | 0.123 | 0.065 |
main_activity_engaged_in | 0.357 | 0.327 | 0.123 | 0.527 | 0.051 | 0.473 | 0.143 | 1.000 | 0.454 | 0.259 | 0.226 | 0.177 | 0.199 | 0.307 | 0.053 |
main_occupation | 0.087 | 0.059 | 0.285 | 0.339 | 0.047 | 0.209 | 0.210 | 0.454 | 1.000 | 0.074 | 0.062 | 0.182 | 0.067 | 0.088 | 0.054 |
marital_status | 0.319 | 0.276 | 0.087 | 0.113 | 0.036 | 0.162 | 0.115 | 0.259 | 0.074 | 1.000 | 0.240 | 0.073 | 0.070 | 0.331 | 0.033 |
member_ID | 0.253 | 0.222 | 0.107 | 0.122 | 0.055 | 0.479 | 0.097 | 0.226 | 0.062 | 0.240 | 1.000 | 0.017 | 0.083 | 0.401 | 0.049 |
member_went_out_for_work_or_not_during_last_week | 0.159 | 0.038 | 0.118 | 0.217 | 0.000 | 0.124 | 0.106 | 0.177 | 0.182 | 0.073 | 0.017 | 1.000 | 0.420 | 0.121 | 0.047 |
no_of_hours_stayed_at_home_during_last_week | 0.142 | 0.096 | 0.125 | 0.112 | 0.020 | 0.283 | 0.066 | 0.199 | 0.067 | 0.070 | 0.083 | 0.420 | 1.000 | 0.125 | 0.033 |
relationship_to_the_head_of_household | 0.346 | 0.280 | 0.113 | 0.145 | 0.063 | 0.526 | 0.123 | 0.307 | 0.088 | 0.331 | 0.401 | 0.121 | 0.125 | 1.000 | 0.071 |
religion | 0.045 | 0.030 | 0.112 | 0.075 | 0.512 | 0.000 | 0.065 | 0.053 | 0.054 | 0.033 | 0.049 | 0.047 | 0.033 | 0.071 | 1.000 |
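The High correlation alerts in the report overview can be traced back to this matrix. The sketch below takes a handful of the largest off-diagonal coefficients copied from the table and keeps those at or above a cutoff of 0.5. The cutoff is an assumption chosen because it reproduces the three alerted pairs; it is not read from the report's configuration.

```python
# Selected off-diagonal coefficients copied from the correlation matrix above.
pairs = {
    ("employment_status_of_the_main_occupation", "main_activity_engaged_in"): 0.527,
    ("gender", "relationship_to_the_head_of_household"): 0.526,
    ("ethnicity", "religion"): 0.512,
    ("gender", "member_ID"): 0.479,
    ("gender", "main_activity_engaged_in"): 0.473,
    ("main_activity_engaged_in", "main_occupation"): 0.454,
    ("daily_wage_owner_or_not", "employment_status_of_the_main_occupation"): 0.428,
    ("member_went_out_for_work_or_not_during_last_week",
     "no_of_hours_stayed_at_home_during_last_week"): 0.420,
}

THRESHOLD = 0.5  # assumed cutoff for a "High correlation" alert

flagged = sorted(pair for pair, value in pairs.items() if value >= THRESHOLD)
for left, right in flagged:
    print(f"{left} <-> {right}: {pairs[(left, right)]:.3f}")
```

Pairs just under the cutoff, such as gender and member_ID at 0.479, are not alerted but may still be worth inspecting before modelling.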
Missing values
Sample
First rows
Index | household_ID | member_ID | age | relationship_to_the_head_of_household | gender | ethnicity | religion | marital_status | current_attendance_in_any_education_instituition | highest_level_of_education | main_activity_engaged_in | main_occupation | daily_wage_owner_or_not | employment_status_of_the_main_occupation | no_of_hours_stayed_at_home_during_last_week | member_went_out_for_work_or_not_during_last_week |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | ID0001 | I1 | 71 | Head of the household | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(A/L) or equivalent | Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) | Related to forces | No | Government employee | 168.0 | No, worked from home |
1 | ID0001 | I2 | 66 | Wife/Husband | Female | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(A/L) or equivalent | Household activities | NaN | NaN | NaN | 168.0 | NaN |
2 | ID0001 | I3 | 32 | Son/daughter | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed post Graduate Degree / Diploma | Engaged in economic activity/ currently employed/ engaged in own business | Professional | No | Government employee | 70.0 | Yes, went daily during working days |
3 | ID0001 | I4 | 30 | Son-in-law/Daughter in law | Female | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed post Graduate Degree / Diploma | Engaged in economic activity/ currently employed/ engaged in own business | Professional | No | Government employee | 150.0 | Yes, went daily during working days |
4 | ID0002 | I1 | 85 | Head of the household | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed Grade 6 | Retired - Obtaining government/semi-government pension payment and currently not engaged in economic activity (not employed elsewhere or not engaged in any own business) | NaN | NaN | NaN | 168.0 | NaN |
5 | ID0002 | I2 | 66 | Parents of the head of the Household/ spouse | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(A/L) or equivalent | Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) | Craft and related worker | No | Private sector employee | 0.0 | Yes, went on most of the days |
6 | ID0002 | I3 | 59 | Son/daughter | Female | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(A/L) or equivalent | Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) | Clerk | No | Government employee | 168.0 | No, worked from home |
7 | ID0003 | I1 | 44 | Head of the household | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed Degree / Diploma | Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) | Professional | No | Government employee | 100.0 | Yes, went daily during working days |
8 | ID0003 | I2 | 41 | Wife/Husband | Female | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed post Graduate Degree / Diploma | Retired and obtaining government/semi-government pension payment and is currently engaged in economic activity (employed elsewhere other than the place where he/she is receiving the pension from / engaged in own business) | Professional | No | Government employee | 100.0 | Yes, went daily during working days |
9 | ID0003 | I3 | 74 | Parents of the head of the Household/ spouse | Female | Sinhala | Buddhism | Widowed | Does not attend | Passed Degree / Diploma | Retired - Obtaining government/semi-government pension payment and currently not engaged in economic activity (not employed elsewhere or not engaged in any own business) | NaN | NaN | NaN | 168.0 | NaN |
Last rows
Index | household_ID | member_ID | age | relationship_to_the_head_of_household | gender | ethnicity | religion | marital_status | current_attendance_in_any_education_instituition | highest_level_of_education | main_activity_engaged_in | main_occupation | daily_wage_owner_or_not | employment_status_of_the_main_occupation | no_of_hours_stayed_at_home_during_last_week | member_went_out_for_work_or_not_during_last_week |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
16260 | ID4060 | I1 | 78 | Head of the household | Female | Sinhala | Buddhism | Never married | Does not attend | Passed Grade 10 | Household activities | NaN | NaN | NaN | 130.0 | NaN |
16261 | ID4061 | I1 | 82 | Head of the household | Female | Sinhala | Buddhism | Widowed | Does not attend | Passed G.C.E.(O/L) | Household activities | NaN | NaN | NaN | 168.0 | NaN |
16262 | ID4061 | I2 | 53 | Son/daughter | Male | Sinhala | Buddhism | Never married | Does not attend | Passed G.C.E.(O/L) | Engaged in economic activity/ currently employed/ engaged in own business | Elementary occupation | Yes | Private sector employee | 70.0 | NaN |
16263 | ID4062 | I1 | 73 | Head of the household | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(O/L) | Engaged in economic activity/ currently employed/ engaged in own business | Service worker and shop and market sales worker | Yes | Own account worker | 168.0 | NaN |
16264 | ID4062 | I2 | 66 | Wife/Husband | Female | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(O/L) | Household activities | NaN | NaN | NaN | 168.0 | NaN |
16265 | ID4063 | I1 | 62 | Head of the household | Female | Sinhala | Buddhism | Widowed | Does not attend | Passed Grade 10 | Household activities | NaN | NaN | NaN | 48.0 | NaN |
16266 | ID4063 | I2 | 49 | Son-in-law/Daughter in law | Male | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed Grade 10 | Engaged in economic activity/ currently employed/ engaged in own business | Service worker and shop and market sales worker | Yes | Private sector employee | 120.0 | NaN |
16267 | ID4063 | I3 | 42 | Son/daughter | Female | Sinhala | Buddhism | Currently married (registered) | Does not attend | Passed G.C.E.(A/L) or equivalent | Household activities | NaN | NaN | NaN | 48.0 | NaN |
16268 | ID4063 | I4 | 37 | Son/daughter | Male | Sinhala | Buddhism | Never married | Does not attend | Passed Grade 10 | Household activities | NaN | NaN | NaN | 168.0 | NaN |
16269 | ID4063 | I5 | 36 | Son/daughter | Male | Sinhala | Buddhism | Never married | Does not attend | Passed Grade 10 | Engaged in economic activity/ currently employed/ engaged in own business | Technician and associate professional | No | Private sector employee | 0.0 | NaN |