Overview

Dataset statistics

Number of variables: 16
Number of observations: 16,270
Missing cells: 45,187
Missing cells (%): 17.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 MiB
Average record size in memory: 128.0 B
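
The missing-cell percentage can be checked from the counts above, since the number of cells is the number of observations times the number of variables. A minimal sketch of that arithmetic, using only figures reported in this section (the variable names are illustrative, not part of the report):

```python
# Figures taken from the dataset statistics above.
n_variables = 16
n_observations = 16_270
missing_cells = 45_187

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are empty, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"Total cells: {total_cells:,}")
print(f"Missing cells: {missing_pct:.1f}%")
```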

Variable types

Text: 1
Categorical: 12
Numeric: 2
Boolean: 1

Alerts

employment_status_of_the_main_occupation is highly overall correlated with main_activity_engaged_in [High correlation]
ethnicity is highly overall correlated with religion [High correlation]
gender is highly overall correlated with relationship_to_the_head_of_household [High correlation]
main_activity_engaged_in is highly overall correlated with employment_status_of_the_main_occupation [High correlation]
relationship_to_the_head_of_household is highly overall correlated with gender [High correlation]
religion is highly overall correlated with ethnicity [High correlation]
ethnicity is highly imbalanced (69.9%) [Imbalance]
current_attendance_in_any_education_instituition is highly imbalanced (56.1%) [Imbalance]
current_attendance_in_any_education_instituition has 418 (2.6%) missing values [Missing]
highest_level_of_education has 775 (4.8%) missing values [Missing]
main_activity_engaged_in has 2133 (13.1%) missing values [Missing]
main_occupation has 9902 (60.9%) missing values [Missing]
daily_wage_owner_or_not has 10090 (62.0%) missing values [Missing]
employment_status_of_the_main_occupation has 9902 (60.9%) missing values [Missing]
member_went_out_for_work_or_not_during_last_week has 11967 (73.6%) missing values [Missing]
no_of_hours_stayed_at_home_during_last_week has 699 (4.3%) zeros [Zeros]
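
The report does not state how the imbalance percentage is derived. One definition that appears consistent with the 69.9% shown for ethnicity is one minus the Shannon entropy divided by its maximum, log2 of the number of classes. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical, and the counts are the ethnicity frequencies listed under Common Values for that variable.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, 1 for a single dominant class.

    Assumed definition, not taken from the report itself.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy_bits = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy_bits / math.log2(k)


# Ethnicity counts as listed in this report.
ethnicity_counts = [13560, 1968, 572, 82, 42, 32, 14]
score = imbalance_score(ethnicity_counts)
print(f"Imbalance: {score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, so a high score here reflects the dominance of a single category rather than the number of categories.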

Reproduction

Analysis started: 2024-12-06 05:54:26.223750
Analysis finished: 2024-12-06 05:54:28.327003
Duration: 2.1 seconds
Software version: ydata-profiling v4.11.0
Download configuration: config.json
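
The reported duration is the difference between the two timestamps above. A short sketch of that check, using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-12-06 05:54:26.223750")
finished = datetime.fromisoformat("2024-12-06 05:54:28.327003")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.1f} seconds")
```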

Variables

Distinct: 4063
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 97,620
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
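
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point ("Lu" for Uppercase Letter, "Nd" for Decimal Number). The sketch below counts categories over the five sample rows shown in this report; the helper name `category_counts` is hypothetical. The standard library does not expose scripts or blocks, so those breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# The five sample rows listed for this variable.
sample = ["ID0001", "ID0001", "ID0001", "ID0001", "ID0002"]
counts = category_counts(sample)
total = sum(counts.values())

for category, n in counts.most_common():
    print(category, n, f"{100 * n / total:.1f}%")
```

Each identifier has two letters and four digits, which is why the category split in the tables for this variable is one third Uppercase Letter to two thirds Decimal Number.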

Unique

Unique: 186
Unique (%): 1.1%

Sample

1st row: ID0001
2nd row: ID0001
3rd row: ID0001
4th row: ID0001
5th row: ID0002
Value                 Count   Frequency (%)
id0349                13      0.1%
id3438                13      0.1%
id0849                12      0.1%
id1781                12      0.1%
id2880                12      0.1%
id0939                12      0.1%
id3013                11      0.1%
id0699                11      0.1%
id2896                11      0.1%
id2341                11      0.1%
Other values (4053)   16152   99.3%
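
The table lists the ten most frequent values and collapses the rest into a single row: 4063 distinct values minus 10 leaves 4053, and 16,270 observations minus the 118 covered by the top ten leaves 16,152. A minimal sketch of that truncation, assuming a plain frequency count (the function name `frequency_table` and the demo data are hypothetical):

```python
from collections import Counter


def frequency_table(values, top_n=10):
    """Return (value, count, percent) rows for the top_n values plus an 'Other values' row."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    rows = [(value, count, 100 * count / total) for value, count in top]

    # Collapse everything outside the top_n into one summary row.
    remaining_distinct = len(counts) - len(top)
    if remaining_distinct > 0:
        remaining_count = total - sum(count for _, count in top)
        rows.append(
            (f"Other values ({remaining_distinct})", remaining_count, 100 * remaining_count / total)
        )
    return rows


# Toy data, not taken from the dataset.
demo = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]
rows = frequency_table(demo, top_n=2)
for value, count, pct in rows:
    print(f"{value:<20} {count:>5} {pct:5.1f}%")
```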

Most occurring characters

Value              Count   Frequency (%)
I                  16270   16.7%
D                  16270   16.7%
0                  9274    9.5%
1                  8926    9.1%
2                  8804    9.0%
3                  8734    8.9%
4                  5068    5.2%
8                  5009    5.1%
6                  4859    5.0%
5                  4827    4.9%
Other values (2)   9579    9.8%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     65080   66.7%
Uppercase Letter   32540   33.3%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       9274    14.3%
1       8926    13.7%
2       8804    13.5%
3       8734    13.4%
4       5068    7.8%
8       5009    7.7%
6       4859    7.5%
5       4827    7.4%
7       4802    7.4%
9       4777    7.3%

Uppercase Letter
Value   Count   Frequency (%)
I       16270   50.0%
D       16270   50.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   65080   66.7%
Latin    32540   33.3%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       9274    14.3%
1       8926    13.7%
2       8804    13.5%
3       8734    13.4%
4       5068    7.8%
8       5009    7.7%
6       4859    7.5%
5       4827    7.4%
7       4802    7.4%
9       4777    7.3%

Latin
Value   Count   Frequency (%)
I       16270   50.0%
D       16270   50.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   97620   100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
I                  16270   16.7%
D                  16270   16.7%
0                  9274    9.5%
1                  8926    9.1%
2                  8804    9.0%
3                  8734    8.9%
4                  5068    5.2%
8                  5009    5.1%
6                  4859    5.0%
5                  4827    4.9%
Other values (2)   9579    9.8%

member_ID
Categorical

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Value              Count
I1                 4063
I2                 3877
I3                 3275
I4                 2457
I5                 1443
Other values (8)   1155

Length

Max length: 3
Median length: 2
Mean length: 2.0027658
Min length: 2

Characters and Unicode

Total characters: 32,585
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I1
2nd row: I2
3rd row: I3
4th row: I4
5th row: I1

Common Values

Value              Count   Frequency (%)
I1                 4063    25.0%
I2                 3877    23.8%
I3                 3275    20.1%
I4                 2457    15.1%
I5                 1443    8.9%
I6                 671     4.1%
I7                 264     1.6%
I8                 120     0.7%
I9                 55      0.3%
I10                26      0.2%
Other values (3)   19      0.1%

Length

Histogram of lengths of the category
Value              Count   Frequency (%)
i1                 4063    25.0%
i2                 3877    23.8%
i3                 3275    20.1%
i4                 2457    15.1%
i5                 1443    8.9%
i6                 671     4.1%
i7                 264     1.6%
i8                 120     0.7%
i9                 55      0.3%
i10                26      0.2%
Other values (3)   19      0.1%

Most occurring characters

Value   Count   Frequency (%)
I       16270   49.9%
1       4119    12.6%
2       3883    11.9%
3       3277    10.1%
4       2457    7.5%
5       1443    4.4%
6       671     2.1%
7       264     0.8%
8       120     0.4%
9       55      0.2%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     16315   50.1%
Uppercase Letter   16270   49.9%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       4119    25.2%
2       3883    23.8%
3       3277    20.1%
4       2457    15.1%
5       1443    8.8%
6       671     4.1%
7       264     1.6%
8       120     0.7%
9       55      0.3%
0       26      0.2%

Uppercase Letter
Value   Count   Frequency (%)
I       16270   100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   16315   50.1%
Latin    16270   49.9%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       4119    25.2%
2       3883    23.8%
3       3277    20.1%
4       2457    15.1%
5       1443    8.8%
6       671     4.1%
7       264     1.6%
8       120     0.7%
9       55      0.3%
0       26      0.2%

Latin
Value   Count   Frequency (%)
I       16270   100.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   32585   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
I       16270   49.9%
1       4119    12.6%
2       3883    11.9%
3       3277    10.1%
4       2457    7.5%
5       1443    4.4%
6       671     2.1%
7       264     0.8%
8       120     0.4%
9       55      0.2%

age
Real number (ℝ)

Distinct: 97
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.39287
Minimum: 0
Maximum: 98
Zeros: 149
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 127.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 19
Median: 38
Q3: 56
95-th percentile: 75
Maximum: 98
Range: 98
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 22.075172
Coefficient of variation (CV): 0.57498103
Kurtosis: -0.99725102
Mean: 38.39287
Median Absolute Deviation (MAD): 18
Skewness: 0.14195376
Sum: 624652
Variance: 487.31322
Monotonicity: Not monotonic
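
Several of these statistics are related by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR is Q3 minus Q1. A short sketch that checks those relationships from the reported figures alone (no raw data is assumed):

```python
# Figures reported for the age variable.
n_observations = 16_270
total = 624_652
std_dev = 22.075172
q1, q3 = 19, 56

mean = total / n_observations      # arithmetic mean
cv = std_dev / mean                # coefficient of variation
variance = std_dev ** 2            # variance from the standard deviation
iqr = q3 - q1                      # interquartile range

print(f"mean={mean:.5f} cv={cv:.6f} variance={variance:.4f} iqr={iqr}")
```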
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
19                  282     1.7%
17                  278     1.7%
23                  276     1.7%
15                  267     1.6%
18                  266     1.6%
20                  263     1.6%
16                  256     1.6%
42                  252     1.5%
22                  251     1.5%
45                  245     1.5%
Other values (87)   13634   83.8%

Value   Count   Frequency (%)
0       149     0.9%
1       131     0.8%
2       138     0.8%
3       186     1.1%
4       171     1.1%
5       177     1.1%
6       163     1.0%
7       164     1.0%
8       196     1.2%
9       198     1.2%

Value   Count   Frequency (%)
98      1       < 0.1%
96      1       < 0.1%
95      3       < 0.1%
93      8       < 0.1%
92      5       < 0.1%
91      7       < 0.1%
90      17      0.1%
89      20      0.1%
88      16      0.1%
87      14      0.1%

relationship_to_the_head_of_household
Categorical

High correlation 

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Value                                          Count
Son/daughter                                   5654
Head of the household                          4012
Wife/Husband                                   3226
Parents of the head of the Household/ spouse   1198
Other relative                                 685
Other values (7)                               1495

Length

Max length: 44
Median length: 12
Mean length: 17.506884
Min length: 5

Characters and Unicode

Total characters: 284,837
Distinct characters: 34
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Head of the household
2nd row: Wife/Husband
3rd row: Son/daughter
4th row: Son-in-law/Daughter in law
5th row: Head of the household

Common Values

Value                                          Count   Frequency (%)
Son/daughter                                   5654    34.8%
Head of the household                          4012    24.7%
Wife/Husband                                   3226    19.8%
Parents of the head of the Household/ spouse   1198    7.4%
Other relative                                 685     4.2%
Grandson/ Granddaughter                        666     4.1%
Son-in-law/Daughter in law                     434     2.7%
Boarder                                        237     1.5%
Domestic servant/driver/watcher                101     0.6%
Other                                          52      0.3%
Other values (2)                               5       < 0.1%

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
of                  6408    16.4%
the                 6408    16.4%
son/daughter        5654    14.5%
head                5210    13.4%
household           5210    13.4%
wife/husband        3226    8.3%
parents             1198    3.1%
spouse              1198    3.1%
other               737     1.9%
relative            685     1.8%
Other values (13)   3086    7.9%

Most occurring characters

Value               Count   Frequency (%)
e                   31961   11.2%
o                   25125   8.8%
h                   24420   8.6%
(space)             22750   8.0%
d                   21639   7.6%
a                   19715   6.9%
u                   16391   5.8%
t                   16090   5.6%
n                   13486   4.7%
s                   12904   4.5%
Other values (24)   80356   28.2%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    228043   80.1%
Space Separator     22750    8.0%
Uppercase Letter    21794    7.7%
Other Punctuation   11382    4.0%
Dash Punctuation    868      0.3%

Most frequent character per category

Lowercase Letter
Value               Count   Frequency (%)
e                   31961   14.0%
o                   25125   11.0%
h                   24420   10.7%
d                   21639   9.5%
a                   19715   8.6%
u                   16391   7.2%
t                   16090   7.1%
n                   13486   5.9%
s                   12904   5.7%
r                   11587   5.1%
Other values (11)   34725   15.2%

Uppercase Letter
Value   Count   Frequency (%)
H       8436    38.7%
S       6088    27.9%
W       3226    14.8%
G       1332    6.1%
P       1198    5.5%
O       737     3.4%
D       537     2.5%
B       237     1.1%
R       3       < 0.1%

Other Punctuation
Value   Count   Frequency (%)
/       11380   > 99.9%
'       2       < 0.1%

Space Separator
Value     Count   Frequency (%)
(space)   22750   100.0%

Dash Punctuation
Value   Count   Frequency (%)
-       868     100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    249837   87.7%
Common   35000    12.3%

Most frequent character per script

Latin
Value               Count   Frequency (%)
e                   31961   12.8%
o                   25125   10.1%
h                   24420   9.8%
d                   21639   8.7%
a                   19715   7.9%
u                   16391   6.6%
t                   16090   6.4%
n                   13486   5.4%
s                   12904   5.2%
r                   11587   4.6%
Other values (20)   56519   22.6%

Common
Value     Count   Frequency (%)
(space)   22750   65.0%
/         11380   32.5%
-         868     2.5%
'         2       < 0.1%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   284837   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
e                   31961   11.2%
o                   25125   8.8%
h                   24420   8.6%
(space)             22750   8.0%
d                   21639   7.6%
a                   19715   6.9%
u                   16391   5.8%
t                   16090   5.6%
n                   13486   4.7%
s                   12904   4.5%
Other values (24)   80356   28.2%

gender
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Value    Count
Female   8386
Male     7884

Length

Max length: 6
Median length: 6
Mean length: 5.0308543
Min length: 4
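
The length statistics follow directly from the two category counts reported for this variable: each value contributes its string length once per occurrence. A minimal sketch of that calculation, using only the counts in this report:

```python
# Category counts reported for gender.
counts = {"Female": 8386, "Male": 7884}

# Total characters: string length weighted by how often each value occurs.
total_chars = sum(len(value) * count for value, count in counts.items())
n_observations = sum(counts.values())
mean_length = total_chars / n_observations

print(f"Total characters: {total_chars:,}")
print(f"Mean length: {mean_length:.7f}")
```

Because more than half of the values are "Female", the median length is 6 even though the mean sits close to 5.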

Characters and Unicode

Total characters: 81,852
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Male
2nd row: Female
3rd row: Male
4th row: Female
5th row: Male

Common Values

Value    Count   Frequency (%)
Female   8386    51.5%
Male     7884    48.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count   Frequency (%)
female   8386    51.5%
male     7884    48.5%

Most occurring characters

Value   Count   Frequency (%)
e       24656   30.1%
a       16270   19.9%
l       16270   19.9%
F       8386    10.2%
m       8386    10.2%
M       7884    9.6%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   65582   80.1%
Uppercase Letter   16270   19.9%

Most frequent character per category

Lowercase Letter
Value   Count   Frequency (%)
e       24656   37.6%
a       16270   24.8%
l       16270   24.8%
m       8386    12.8%

Uppercase Letter
Value   Count   Frequency (%)
F       8386    51.5%
M       7884    48.5%

Most occurring scripts

Value   Count   Frequency (%)
Latin   81852   100.0%

Most frequent character per script

Latin
Value   Count   Frequency (%)
e       24656   30.1%
a       16270   19.9%
l       16270   19.9%
F       8386    10.2%
m       8386    10.2%
M       7884    9.6%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   81852   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
e       24656   30.1%
a       16270   19.9%
l       16270   19.9%
F       8386    10.2%
m       8386    10.2%
M       7884    9.6%

ethnicity
Categorical

High correlation  Imbalance 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Value                    Count
Sinhala                  13560
Sri Lankan Moor/Muslim   1968
Sri Lankan Tamil         572
Indian Tamil             82
Malay                    42
Other values (2)         46

Length

Max length: 22
Median length: 7
Mean length: 9.1491088
Min length: 5

Characters and Unicode

Total characters: 148,856
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sinhala
2nd row: Sinhala
3rd row: Sinhala
4th row: Sinhala
5th row: Sinhala

Common Values

Value                    Count   Frequency (%)
Sinhala                  13560   83.3%
Sri Lankan Moor/Muslim   1968    12.1%
Sri Lankan Tamil         572     3.5%
Indian Tamil             82      0.5%
Malay                    42      0.3%
Burgher                  32      0.2%
Other                    14      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count   Frequency (%)
sinhala       13560   63.3%
sri           2540    11.9%
lankan        2540    11.9%
moor/muslim   1968    9.2%
tamil         654     3.1%
indian        82      0.4%
malay         42      0.2%
burgher       32      0.1%
other         14      0.1%
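
These word counts can be derived from the category counts by lower-casing each value, splitting it on whitespace, and weighting every token by how often its category occurs. The sketch below assumes that tokenisation, which reproduces the figures shown for ethnicity; the profiler's own tokeniser may differ in details.

```python
from collections import Counter

# Category counts reported for ethnicity.
category_counts = {
    "Sinhala": 13560,
    "Sri Lankan Moor/Muslim": 1968,
    "Sri Lankan Tamil": 572,
    "Indian Tamil": 82,
    "Malay": 42,
    "Burgher": 32,
    "Other": 14,
}

# Assumed tokenisation: lower-case, then split on whitespace.
words = Counter()
for value, count in category_counts.items():
    for token in value.lower().split():
        words[token] += count

total_words = sum(words.values())
for token, count in words.most_common():
    print(f"{token:<12} {count:>6} {100 * count / total_words:5.1f}%")
```

Percentages here are relative to the total number of tokens rather than the number of rows, which is why "sinhala" accounts for 63.3% of words but 83.3% of observations.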

Most occurring characters

Value               Count   Frequency (%)
a                   33020   22.2%
n                   18804   12.6%
i                   18804   12.6%
l                   16224   10.9%
S                   16100   10.8%
h                   13606   9.1%
(space)             5162    3.5%
r                   4586    3.1%
M                   3978    2.7%
o                   3936    2.6%
Other values (15)   14636   9.8%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    118326   79.5%
Uppercase Letter    23400    15.7%
Space Separator     5162     3.5%
Other Punctuation   1968     1.3%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
a                  33020   27.9%
n                  18804   15.9%
i                  18804   15.9%
l                  16224   13.7%
h                  13606   11.5%
r                  4586    3.9%
o                  3936    3.3%
m                  2622    2.2%
k                  2540    2.1%
u                  2000    1.7%
Other values (6)   2184    1.8%

Uppercase Letter
Value   Count   Frequency (%)
S       16100   68.8%
M       3978    17.0%
L       2540    10.9%
T       654     2.8%
I       82      0.4%
B       32      0.1%
O       14      0.1%

Space Separator
Value     Count   Frequency (%)
(space)   5162    100.0%

Other Punctuation
Value   Count   Frequency (%)
/       1968    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    141726   95.2%
Common   7130     4.8%

Most frequent character per script

Latin
Value               Count   Frequency (%)
a                   33020   23.3%
n                   18804   13.3%
i                   18804   13.3%
l                   16224   11.4%
S                   16100   11.4%
h                   13606   9.6%
r                   4586    3.2%
M                   3978    2.8%
o                   3936    2.8%
m                   2622    1.9%
Other values (13)   10046   7.1%

Common
Value     Count   Frequency (%)
(space)   5162    72.4%
/         1968    27.6%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   148856   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
a                   33020   22.2%
n                   18804   12.6%
i                   18804   12.6%
l                   16224   10.9%
S                   16100   10.8%
h                   13606   9.1%
(space)             5162    3.5%
r                   4586    3.1%
M                   3978    2.7%
o                   3936    2.6%
Other values (15)   14636   9.8%

religion
Categorical

High correlation 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Value                           Count
Buddhism                        10807
Roman Catholicism               2442
Islam                           2053
Other Christian denominations   547
Hinduism                        407
Other values (2)                14

Length

Max length: 29
Median length: 8
Mean length: 9.6779348
Min length: 5

Characters and Unicode

Total characters: 157,460
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Buddhism
2nd row: Buddhism
3rd row: Buddhism
4th row: Buddhism
5th row: Buddhism

Common Values

Value                           Count   Frequency (%)
Buddhism                        10807   66.4%
Roman Catholicism               2442    15.0%
Islam                           2053    12.6%
Other Christian denominations   547     3.4%
Hinduism                        407     2.5%
Other                           8       < 0.1%
No religion                     6       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value           Count   Frequency (%)
buddhism        10807   54.5%
roman           2442    12.3%
catholicism     2442    12.3%
islam           2053    10.4%
other           555     2.8%
christian       547     2.8%
denominations   547     2.8%
hinduism        407     2.1%
no              6       < 0.1%
religion        6       < 0.1%

Most occurring characters

Value               Count   Frequency (%)
d                   22568   14.3%
i                   18705   11.9%
m                   18698   11.9%
s                   16803   10.7%
h                   14351   9.1%
u                   11214   7.1%
B                   10807   6.9%
a                   8031    5.1%
o                   5990    3.8%
n                   5043    3.2%
Other values (13)   25250   16.0%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   134659   85.5%
Uppercase Letter   19259    12.2%
Space Separator    3542     2.2%

Most frequent character per category

Lowercase Letter
Value              Count   Frequency (%)
d                  22568   16.8%
i                  18705   13.9%
m                  18698   13.9%
s                  16803   12.5%
h                  14351   10.7%
u                  11214   8.3%
a                  8031    6.0%
o                  5990    4.4%
n                  5043    3.7%
l                  4501    3.3%
Other values (5)   8755    6.5%

Uppercase Letter
Value   Count   Frequency (%)
B       10807   56.1%
C       2989    15.5%
R       2442    12.7%
I       2053    10.7%
O       555     2.9%
H       407     2.1%
N       6       < 0.1%

Space Separator
Value     Count   Frequency (%)
(space)   3542    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Latin    153918   97.8%
Common   3542     2.2%

Most frequent character per script

Latin
Value               Count   Frequency (%)
d                   22568   14.7%
i                   18705   12.2%
m                   18698   12.1%
s                   16803   10.9%
h                   14351   9.3%
u                   11214   7.3%
B                   10807   7.0%
a                   8031    5.2%
o                   5990    3.9%
n                   5043    3.3%
Other values (12)   21708   14.1%

Common
Value     Count   Frequency (%)
(space)   3542    100.0%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   157460   100.0%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
d                   22568   14.3%
i                   18705   11.9%
m                   18698   11.9%
s                   16803   10.7%
h                   14351   9.1%
u                   11214   7.1%
B                   10807   6.9%
a                   8031    5.1%
o                   5990    3.8%
n                   5043    3.2%
Other values (13)   25250   16.0%

marital_status
Categorical

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 127.2 KiB

Value                            Count
Currently married (registered)   7417
Never married                    6330
Currently married (customary)    1311
Widowed                          893
Other                            138
Other values (4)                 181

Length

Max length: 33
Median length: 30
Mean length: 21.736755
Min length: 5

Characters and Unicode

Total characters: 353,657
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Currently married (registered)
2nd row: Currently married (registered)
3rd row: Currently married (registered)
4th row: Currently married (registered)
5th row: Currently married (registered)

Common Values

Value                               Count   Frequency (%)
Currently married (registered)      7417    45.6%
Never married                       6330    38.9%
Currently married (customary)       1311    8.1%
Widowed                             893     5.5%
Other                               138     0.8%
Separated (not legally)             64      0.4%
Not married but lives as a Family   52      0.3%
Divorced                            44      0.3%
Legally separated                   21      0.1%

Length

Histogram of lengths of the category

Common Values (Plot)
