Identifying Foremost Factors Relevant to Blood Pressure Level Using Logistic Regression Analysis: A Case Study (Desert Dwellers Data)

The current research investigates the use of logistic regression as a statistical technique for modelling real blood pressure (BP) data. This study uses a dataset collected from a desert community in southwestern Libya. Six factors that are widely believed to play an important role in BP were considered. Statistical analyses of the available dataset revealed that the main cause of hypertension in this community is age. The proposed multiple logistic regression analysis also revealed that two factors, age and systolic BP, showed greater significance among the six examined variables. These two variables were identified as having a significant effect on blood pressure. Based on a predetermined criterion, each participant was classified as hypertensive or not, and significant variables were selected based on the p-value relative to the model significance level (p < 0.05). The statistical analysis was carried out using the R language.


Introduction
Around the world, numerous studies have addressed blood pressure (BP) and related issues, in particular high blood pressure (HBP). HBP is considered one of the top diseases threatening human life and one of the leading global risks for mortality [1,2]. Such studies adopt different approaches and consider different aspects when investigating the BP mechanism. The aim of almost all such investigations is to identify the main causes of HBP and the issues relevant to this disease. HBP is a worldwide community problem [3]. BP has a tendency to rise as people get older; thus, everyone's risk for hypertension increases with age. Hypertension can be hereditary. The risk of HBP increases when hereditary factors are combined with unhealthy lifestyle choices [4], [5].
According to [6], in Libya approximately 68.4% of males and 48.4% of females who suffer from HBP are under treatment. High systolic blood pressure, high body-mass index, and dietary risks were the leading risk factors in 2013. Additionally, [6] pointed out that obesity affects between 26% and 41% of adult Libyan women and between 11% and 21% of adult Libyan men (approximately 64% of Libyan adults are either overweight or obese); obesity progressively increases with age and is two times more common among Libyan women than men. Hypertension is a common comorbidity of diabetes.
According to [6], BP measurements are important for assessing the health of children and adolescents as well as adults. The mean values of height, weight, pulse rate, systolic blood pressure (SBP) and diastolic blood pressure (DBP) were higher among males than females, whereas body mass index (BMI) and body fat percentage were higher among females than males. There was a significant positive correlation of BMI and lipids with both SBP and DBP. The odds ratio showed that overweight/obese subjects were more likely to develop hypertension than those with a normal body mass index, and the findings suggested early clinical detection of hypertension and intervention, including lifestyle modification, especially weight management.
HBP is widely known as hypertension, and it is one of the most widespread diseases. Hypertension is the most common cardiovascular disorder, affecting approximately one billion people worldwide and accounting for approximately seven million deaths annually. Some of the known risk factors for primary hypertension, such as age, genetics, and sex, are nonmodifiable. On the other hand, the majority of other risk factors, such as tobacco use, alcohol use, unhealthy diet, physical inactivity, overweight and obesity, can be effectively prevented [7,8]. Therefore, HBP is one of the challenges that the health services sector is facing and fighting globally. This is due to its high frequency and its association with a number of the most dangerous human diseases, including cardiovascular and kidney diseases. Cardiovascular and kidney diseases have been recognized as being among the top causes of mortality and rank third as a cause of disability-adjusted life-years [9]. The authors of [9] mentioned that various risk factors have been associated with hypertension, including age, sex, race, physical activity, and socioeconomic class. Additionally, they mentioned that the vast majority of cases of uncontrolled hypertension occur in individuals more than 60 years of age. Population studies have also shown that BP correlates with body mass index (BMI) and other anthropometric indices of fatness, such as waist-hip ratio. In the Framingham study, 70% of new cases of hypertension were related to excess body fat; see, e.g., [10] and [11].
The reported incidence of hypertension varies around the world. Differences in blood pressure also exist within communities in the same country, with the prevalence of hypertension in populated countries ranging from approximately 20% to 50% [12]. In the Asia-Pacific region, the incidence of hypertension ranges from 5% to 47% in men and from 7% to 38% in women [13]. Hypertension is the main risk factor for death in sub-Saharan Africa [1], and the prevalence of hypertension in the adult population (over 25 years) in sub-Saharan Africa ranged from 38% to 56% in 2008, compared to 30% in the United States and 26% to 44% in Western Europe [2,14].
According to [15], risk factors for hypertension among women are age, income, employment status, lack of physical activity, inherited hypertension, stress and insomnia. Low consumption of fruit, smoking, stress and lack of physical activity cause cardiovascular disease, and their study found that the main factor contributing to the incidence of cardiovascular disease is HBP.
According to [16], female patients in the 36-48 years age group mostly suffer from stress that raises blood pressure. The most important factors that affected blood pressure during stress were children's marriages, family issues, low income and workload. The authors of [11] estimated the remaining lifetime probability of developing hypertension in both male and female adult US citizens and found that 90% of participants aged 55-65 years had hypertension. Women had a higher likelihood of increased blood pressure than men. In [17], the required data were obtained from women aged 35-55 years in Korea using questionnaires. The authors applied logistic regression models and concluded that oral contraceptive (OC) use and hypertension are directly related to each other: Korean women who used OC experienced HBP more often than women who never used OC. In [18], multiple logistic regression analysis was applied to data collected through a questionnaire-based survey to predict the risk factors for hypertension in the population of the Gaza Strip, Palestine. The authors concluded that high blood pressure is closely related to lack of physical activity, education, obesity, low income and a family history of high blood pressure. Their study also showed that women had a higher rate of high blood pressure than men. Using statistical analysis techniques, the studies in [4,19] examined the causes of high blood pressure and cardiovascular disease among inhabitants of a village in Brazil aged 18 years or more of both sexes. The results of the statistical analysis showed that approximately 30% of the participants had high blood pressure, and no statistically significant difference was found between men and women.
According to [20,21], hypertension, low consumption of fruits, smoking, education, stress, and physical inactivity cause cardiovascular disease. It was evident from their study that the major contributing factor to cardiovascular disease was high blood pressure. The logistic regression model has become a standard statistical method of analysis in many situations over the last decade and has been used extensively in medical research studies, in addition to helping to make medical decisions. The paper is organized into five sections: introduction, methodology, data and statistical analysis and results, discussion and, finally, conclusion.

Aim and objectives
The objectives of this study were to determine the prevalence of hypertension and its associated risk factors in Al-Shati, Libya. The data were collected by one of the co-authors in 2006 from a public health services centre that was established to serve village communities of more than 10 thousand people located within the studied area.

JOPAS Vol.22 No. 2 2023

Materials and Methods

Description of the collected data
The systematic random sampling (SRS) technique was used to draw the sample: the first subject was selected randomly, and thereafter every fifth person to arrive was selected. A total of 111 participants, 55 patients (test group) and 56 healthy persons (control group), were referred to the hospital.

The study depends upon a sample of persons living in a region located in the southwest part of Libya. The region is considered a village community and is characterized by warm temperatures year-round and low annual precipitation, although extreme weather conditions can lead to very cold temperatures. The average temperature usually exceeds 40 °C in the summer months (May to Sep) and falls below 0 °C, i.e., nightly frost, in the winter months (Dec and Jan).

The involved individuals believed that they had no known experience of suffering from HBP. Data on the following variables were collected: age, sex, weight, height, systolic blood pressure and diastolic blood pressure. Participation in the study was voluntary: each selected person agreed to participate and provided the specific data according to a predesigned questionnaire. Alkalimi, one of the authors, was responsible for collecting the data and filling out the questionnaire. The BP measurements of each selected volunteer were repeated on three separate occasions. The interval between two successive occasions was approximately 24 hours, i.e., one measurement per day. The observed data for systolic blood pressure (SBP) and diastolic blood pressure (DBP) were the arithmetic means of the three readings. A positive diagnosis of hypertension was made when the SBP reading was ≥ 140 mmHg and/or the DBP reading was ≥ 90 mmHg. Otherwise, the individual was classified as free from hypertension.
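The classification rule described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' R code; the function name `classify_hypertension` and the example readings are hypothetical, and only the thresholds (140 mmHg and 90 mmHg) and the averaging of three readings come from the text.

```python
from statistics import mean

SBP_THRESHOLD = 140  # mmHg, threshold stated in the text
DBP_THRESHOLD = 90   # mmHg, threshold stated in the text


def classify_hypertension(sbp_readings, dbp_readings):
    """Return 1 (hypertensive) or 0 (free from hypertension).

    The repeated readings are averaged first; the rule
    SBP >= 140 mmHg and/or DBP >= 90 mmHg is then applied
    to the averages.
    """
    mean_sbp = mean(sbp_readings)
    mean_dbp = mean(dbp_readings)
    return int(mean_sbp >= SBP_THRESHOLD or mean_dbp >= DBP_THRESHOLD)


# Hypothetical readings, not taken from the study data
print(classify_hypertension([138, 144, 141], [84, 88, 86]))  # mean SBP is 141
print(classify_hypertension([118, 122, 120], [78, 80, 82]))  # both means below threshold
```

Averaging before thresholding means that a single elevated reading does not, on its own, lead to a positive classification.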

Data Analysis of collected data
A logistic regression model was developed by [22,23] using multivariate analysis in the Framingham Heart Study. Since then, the logistic regression model has become a standard method for analysing binary or dichotomous data in various areas of medicine.
The data were tabulated, cross-tabulated and statistically analysed using the R programming language. Odds ratios were reported to establish the risk for hypertension, and a p value < 0.05 was considered statistically significant. Binary logistic regression analysis was conducted. A set of logistic regression models is fitted in this section to explore the underlying association between hypertension and the selected explanatory variables.

Modelling of Data using Logistic Regression
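Let us consider the dichotomous variable for hypertension. For this, we define treatment (Treat) as follows: Treat = 0 if there is no existing hypertension and Treat = 1 if hypertension exists. Then, the logit for multiple logistic regression is given by Equation (1):

$$\operatorname{logit}(\pi) = \ln\left(\frac{\pi}{1-\pi}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k \qquad (1)$$

where $\pi = P(\text{Treat} = 1)$ is the probability of hypertension and $x_1, \ldots, x_k$ are the explanatory variables. The intercept α and the β's are typically estimated by the maximum likelihood (ML) method, which is preferred over the weighted least squares approach by several authors, e.g., [21,24].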
The ML method is designed to maximize the likelihood of reproducing the data given the parameter estimates. The dependent variable Treat was entered into the analysis with 0 or 1 coding; continuous predictors were entered with their continuous values, and categorical predictors were entered with dummy coding (e.g., 0 or 1).
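To make the ML step concrete, the following minimal sketch fits a logistic regression with a single predictor by Newton-Raphson iteration. It is written in plain Python for illustration only; the study itself used R, and the function names (`sigmoid`, `fit_logistic`), the iteration count and the age/status values are all hypothetical and not taken from the study data.

```python
import math


def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, iterations=25):
    """Fit logit(pi) = alpha + beta * x by Newton-Raphson ML."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix entries (h00, h01, h11)
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(alpha + beta * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve (information matrix) * step = score
        alpha += (h11 * g0 - h01 * g1) / det
        beta += (h00 * g1 - h01 * g0) / det
    return alpha, beta


# Hypothetical ages and hypertension status (0/1), for illustration only
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
status = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
alpha_hat, beta_hat = fit_logistic(ages, status)
print(alpha_hat, beta_hat)
print(sigmoid(alpha_hat + beta_hat * 60))  # fitted probability at age 60
```

The sketch assumes the two outcome classes overlap in the predictor; with perfectly separated data the ML estimates do not exist and the iteration would not settle.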
The null hypothesis underlying the overall model states that all β's equal zero. A rejection of this null hypothesis implies that at least one β does not equal zero in the population, which means that the logistic regression equation predicts the probability of the outcome better than the mean of the dependent variable Treat. The interpretation of results is rendered using the odds ratio for both categorical and continuous predictors.
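The odds-ratio interpretation follows directly from the logit form of the model. As a short worked derivation, write the logit as $\alpha + \sum_j \beta_j x_j$ (this notation is assumed here) and compare the odds of hypertension at $x_j + 1$ and at $x_j$ with all other predictors held fixed:

```latex
\frac{\operatorname{odds}(x_j + 1)}{\operatorname{odds}(x_j)}
  = \frac{\exp\{\alpha + \cdots + \beta_j (x_j + 1) + \cdots\}}
         {\exp\{\alpha + \cdots + \beta_j x_j + \cdots\}}
  = e^{\beta_j}
```

For instance, a purely hypothetical coefficient of $\beta_j = 0.05$ for age would correspond to an odds ratio of $e^{0.05} \approx 1.05$, i.e., roughly a 5% increase in the odds of hypertension per additional year of age.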
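A reading of 120/80 mmHg is considered normal blood pressure, where 120 corresponds to systolic and 80 to diastolic pressure. A reading of 140/90 mmHg or higher is referred to as high blood pressure. Information concerning risk factors that contribute to high blood pressure was collected. We used binary logistic regression to identify the risk factors that affect the level of blood pressure.
The binary logistic regression model is as follows:

$$\pi = P(\text{Treat} = 1 \mid x_1, \ldots, x_k) = \frac{\exp(\alpha + \beta_1 x_1 + \cdots + \beta_k x_k)}{1 + \exp(\alpha + \beta_1 x_1 + \cdots + \beta_k x_k)}$$

where α is the intercept, the β's are the regression coefficients and $x_1, \ldots, x_k$ are the explanatory variables.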

Logistic Regression Models
From Table 2, there are two significant factors in the multiple logistic regression model. The first factor was age (p value < 0.001), and the second factor was systolic (p value < 0.0001). The p value obtained for each significant factor entered in the model varies depending on certain measures (0.926). These interpretations suggest that both age and systolic blood pressure are important factors in predicting hypertension, with lower values of these variables being associated with a lower risk of developing hypertension.

Hosmer and Lemeshow Test
The Hosmer and Lemeshow test is a commonly used goodness-of-fit test for checking the calibration of a logistic regression model. It is based on grouping cases into deciles of risk and compares the observed probability with the expected probability within each decile. In this test, the p value is checked: if it is greater than 0.05, there is no significant difference between the observed probability and the expected probability. Based on Table 3, the p value obtained (0.4) is greater than 0.05, suggesting that the model fits the data well. In other words, the null hypothesis of a good model fit to the data was tenable [25].
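The grouping-and-comparison idea behind the test can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the routine used in the study: the function name `hosmer_lemeshow_statistic` is hypothetical, the predicted probabilities are assumed to lie strictly between 0 and 1, and only the chi-square statistic is computed (the p value would come from a chi-square distribution with the number of groups minus two degrees of freedom).

```python
def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Return the Hosmer-Lemeshow chi-square statistic.

    Cases are sorted by predicted probability and split into
    `groups` bins of near-equal size (deciles of risk by default).
    """
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # skip empty bins when there are fewer cases than groups
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of events
        observed = sum(y for _, y in chunk)  # observed number of events
        mean_p = expected / size
        # Group contribution: (O - E)^2 / (n_g * p_bar * (1 - p_bar))
        statistic += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return statistic


# Hypothetical predictions and outcomes, for illustration only
print(hosmer_lemeshow_statistic([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0], groups=2))
```

A small statistic (and hence a large p value) indicates that observed and expected event counts agree within the risk groups.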

Receiver Operating Characteristic Curve (ROC)
The ROC curve is a fundamental tool for diagnostic test evaluation. It is a graphical plot of the sensitivity against 1 − specificity across all classification thresholds, and the area under it measures the overall performance of a diagnostic test.
The area under the ROC curve takes on a value between 0 and 1, since both the x- and y-axes have values ranging from 0 to 1.
The closer the area is to 1.0, the better the test is, and the closer the area is to 0.5, the worse the test is. If the area is 1.0, we have an ideal test because it achieves both 100% sensitivity and 100% specificity. If the area is 0.5, then we have a test that effectively has 50% sensitivity and 50% specificity, i.e., one that performs no better than chance [26,27]. In practice, a diagnostic test will have an area somewhere between these two extremes.
The ROC curve (Figure 1) and Table 4 show that the AUROC (area under the receiver operating characteristic curve) for the hypertension data is 0.72. This is interpreted as the probability that a randomly chosen patient who has hypertension receives a higher predicted risk than a randomly chosen patient who does not have hypertension.
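The AUROC can be read as the probability that a randomly chosen hypertensive case receives a higher predicted risk than a randomly chosen non-hypertensive case, and it can be computed directly from that definition. The sketch below is illustrative Python rather than the R routine used in the study; the function name `auroc` and the example scores are hypothetical.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the proportion of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and true status, for illustration only
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 3 of 4 pairs ordered correctly: 0.75
```

The pairwise form is quadratic in the sample size, which is acceptable for a sample of about a hundred participants; rank-based formulas give the same value more efficiently for large samples.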

Discussion
In this study, the variable SBP was more significant than the other variables. This finding is at odds with numerous other studies that have reported that systolic exercise is a strong predictor of hypertension [28]. In carrying out this study, an attempt was made to determine the association between different risk factors and hypertension by binary logistic regression analysis.
The prevalence of hypertension increases with increasing age. Correspondingly, health surveys in the United States and in England (2003) reported a strong correlation between age and blood pressure [29]. In central Malaysia, the prevalence of hypertension was shown to be 25.6% among those aged 55 years and above and 51.1% among those living in old folks' homes. The prevalence of hypertension among the elderly in this study is comparable with the findings of another study conducted in northern Malaysia [30]. Obesity is an established risk factor for hypertension [31]. Additionally, this study reveals that the prevalence of hypertension increased with weight. In the Ansan City study conducted in Korea, weight and abdominal circumference were found to be risk factors for hypertension [32]. Elsewhere in Asia, the prevalence of overweight and hypertension was most common in Japan, followed by Iran, urban India, Singapore, urban Sri Lanka, and the urban Philippines [33]. In [26], an attempt was made to determine the association between different risk factors and hypertension by binary logistic regression analysis. The result was that the variable systolic blood pressure was more significant than the other variables, and through multiple logistic regression analysis, three factors were identified as having a significant influence on human blood pressure (hypertension). These factors are age, body mass index (BMI) and systolic blood pressure. They were selected based on the criterion of achieving the model significance level (p value < 0.05).

Conclusion
The present study examined how human blood pressure responds to certain factors in order to determine the probability that a person develops hypertension. Blood pressure has a direct and significant impact on heart disease. It can be concluded from the present study that the foremost cause of high blood pressure is age. Through multiple logistic regression analysis, two factors, age and systolic BP, which were the most significant among the six tested factors, were identified as significantly influencing human blood pressure (hypertension). These two factors were selected based on the criterion of achieving the model significance level (p value < 0.05). In conclusion, these two factors affect blood pressure and the risk of hypertension.

Table 2: Estimates of parameters of logistic regression model for hypertension

Table 3: Hosmer and Lemeshow Test

Table 4: Area under the receiver operating characteristic curve (AUROC) for hypertension

Fig. 1. Receiver operating characteristic (ROC) curve for hypertension