Document Type: Research Paper

Author

Department of Computer Science, Faculty of Mathematical Sciences, University of Guilan, Rasht, Iran.

Abstract

Multi-label learning is an emerging research direction that deals with data in which an instance may belong to multiple class labels simultaneously. Because many multi-label datasets have very large feature spaces containing hundreds of irrelevant and redundant features, multi-label feature selection is a fundamental pre-processing tool for selecting a subset of the most representative and discriminative features. This paper introduces a Python-based open-source library that provides state-of-the-art information-theoretic, filter-based multi-label feature selection algorithms. The library, called PyIT-MLFS, is designed to facilitate the development of new algorithms. It is the first comprehensive open-source library for implementing multi-label feature selection algorithms. Moreover, it provides a high-level interface that enables end users to test and compare the algorithms already implemented. PyIT-MLFS is available from https://github.com/Sadegh28/PyIT-MLFS.
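
To make the idea of an information-theoretic filter concrete, the following is a minimal sketch of one simple scoring rule: each discrete feature is scored by the sum of its empirical mutual information with every label, and the top-scoring features are kept. It does not use or reproduce the PyIT-MLFS interface; the function names `mutual_information` and `rank_features` and the toy data are assumptions made purely for illustration, and the algorithms in the library use more elaborate criteria (for example, redundancy terms).

```python
import numpy as np


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete 1-D arrays."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)

    # Joint frequency table, normalised into a joint probability table.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()

    # Marginals and their outer product p(x) * p(y).
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    outer = px @ py

    # Sum p(x, y) * log(p(x, y) / (p(x) p(y))) over non-empty cells only.
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / outer[nz])))


def rank_features(X, Y, k):
    """Hypothetical helper: score each feature by its summed mutual
    information with all labels and return the indices of the top-k."""
    X = np.asarray(X)
    Y = np.asarray(Y)
    scores = np.array([
        sum(mutual_information(X[:, j], Y[:, l]) for l in range(Y.shape[1]))
        for j in range(X.shape[1])
    ])
    order = np.argsort(-scores, kind="stable")
    return order[:k], scores


# Toy data (assumed): feature 0 copies label 0, feature 1 is constant,
# feature 2 copies label 1.
X = np.array([[0, 0, 0],
              [0, 0, 1],
              [1, 0, 0],
              [1, 0, 1]])
Y = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])

selected, scores = rank_features(X, Y, k=2)
```

In this toy setting the constant feature carries no information about either label, so it receives a score of zero and is left out of the selected pair.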
