Factors Influencing Data Partiality in Artificial Intelligence

FATEN ELINA KAMARUDDIN; NUR HANISAH MOHAMAD RAZALI; AHMAD FUZI MD AJIS; NUR RIFHAN AB RAHIM; SITI NOORHASLINA ABD HALIM; AINOL MARDHIYAH RAHMAT

doi:10.22610/imbr.v16i3S(I)a.3861

FATEN ELINA KAMARUDDIN Universiti Teknologi MARA
NUR HANISAH MOHAMAD RAZALI Universiti Teknologi MARA
AHMAD FUZI MD AJIS Universiti Teknologi MARA
NUR RIFHAN AB RAHIM Universiti Teknologi MARA
SITI NOORHASLINA ABD HALIM Universiti Teknologi MARA
AINOL MARDHIYAH RAHMAT Universiti Teknologi MARA

Keywords: Data partiality, artificial intelligence, black data, algorithm, user revise terminology

Abstract

This study proposes a conceptual framework for investigating the factors that influence data partiality in artificial intelligence (AI). Academic research on data partiality in AI is limited across bibliographic databases. The study addresses this gap with a framework that integrates three factors highlighted in past literature: the AI algorithm, black data, and user revise terminology. The AI algorithm factor concerns problems in the training data used by AI tools, which introduce partiality into the output retrieved by the user. Black data influences data partiality through the existence of unknown data. User revise terminology refers to the keywords a user enters when searching for information; incorrect or unspecific keywords lead the AI to return all related information as unfiltered output. The framework asserts that these three factors directly affect data partiality in AI. A quantitative methodology will be used, with survey data collected from the community of the Global Online Workforce (GLOW) program run by the Malaysia Digital Economy Corporation (MDEC). The framework contributes a theoretical understanding of how the AI algorithm, black data, and user revise terminology influence data partiality in AI. Future research can extend the framework to test data partiality in AI tools used by information agencies, as these bodies are responsible for safeguarding the accuracy of information.
