Biblio
Deadlock is one of the critical problems in the message passing interface. At present, most techniques for detecting the MPI deadlock issue rely on exhausting all execution paths of a program, which is extremely inefficient. In addition, with the increasing number of wildcards that receive events and processes, the number of execution paths raises exponentially, further worsening the situation. To alleviate the problem, we propose a deadlock detection approach called SAMPI based on match-sets to avoid exploring execution paths. In this approach, a match detection rule is employed to form the rough match-sets based on Lazy Lamport Clocks Protocol. Then we design three refining algorithms based on the non-overtaking rule and MPI communication mechanism to refine the match-sets. Finally, deadlocks are detected by analyzing the refined match-sets. We performed the experimental evaluation on 15 various programs, and the experimental results show that SAMPI is really efficient in detecting deadlocks in MPI programs, especially in handling programs with many interleavings.
ISSN: 2168-9253
Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.