The Large Model Era: Igniting the Flame of Education Reform | GAIR 2023

Large Models, Education, Publishing

Author: Yu Kuai (余快)

2023/09/04 16:47

GPT ignited the era of large models, and the resulting blaze has left the education sector caught between ice and fire: some institutions explicitly prohibit it, while others actively embrace it.

To some extent, GPT, which can answer 'any question', is indeed a 'thorny rose'. But we should neither deify GPT nor treat it as a raging beast.

Large models have been running wild for most of the past six months, and GPT has penetrated everywhere. It is forcing change in education, and we have to rethink the impact and challenges that AI brings to the field.

Understanding how GPT works, and using the technology sensibly so as to avoid "technological alienation", are issues that the education field needs to consider systematically.

How should we make sense of what changes in education, and what stays constant, under the profound influence of GPT?

The GAIR Conference, jointly organized by GAIR Research Institute, World Scientific Publishing, Kotler Consulting Group, and Leiphone (雷峰网), was held in Singapore on August 14 and 15. Reflecting the latest trends in the technology industry, the "Exam-Oriented and Elite Education" sub-forum on the afternoon of the 15th invited three practitioners in the education field to share their insights.

General Manager of World Scientific Chi Wai (Rick) Lee: Is the education system being disrupted by GPT?

As chair of this special session, Chi Wai (Rick) Lee shared his views on how GPT is changing education.

People often say that healthcare and education are the most important livelihood issues. Professor Xu Dong's life science seminar, for example, looked at using AI to unravel the mysteries of the human body and medicine; the education seminar explored the impact of GPT on the education industry and on the teaching model as a whole.

In the field of publishing, ChatGPT has attracted widespread attention recently. At the beginning of this year, a medical research paper published in an academic journal listed ChatGPT as one of its co-authors. This created many new problems for the publishing industry, concerning not just the role of ChatGPT as an author but also the role of academic publishers, which are considered gatekeepers of published academic research.

Rick mentioned that after that incident, all major academic publishers around the world had to come up with position statements on authorship and AI tools immediately, to further emphasize authors' accountability for the originality and integrity of their content.

Following the dramatic emergence of ChatGPT this year, there is a need for a more comprehensive investigation into its influence on research, publishing, higher education, and related areas.

The speakers in this ChatGPT + Education session were Peter Schoppert of the National University of Singapore Press, Aaron Tay of the Singapore Management University Library, and Professor Lai Choy Heng of the National University of Singapore. Their backgrounds, professional journeys, and topics differ, but they share a common focus: how the emergence of GPT is transforming education.

President of NUS Press Peter Schoppert: Rethinking Copyright, Privacy, and Responsibility under LLMs

Peter Schoppert reviewed the working principles of LLMs. He argued that we need to revisit how three key elements work (training data, parameters, and the modular character of deployment), along with the three issues they give rise to: copyright, privacy, and liability.

Firstly, there is a great deal of legal uncertainty around copyright. For example, most of the text Google uses to train its models comes from Google Books. Yet authors whose books were uploaded to Google's library were unaware that Google would use them for model training. The same goes for news reports and other training texts, whose copyright status is unclear.

Peter examined two claims. The first is that training is “fair use”. He analyzed the four fair-use factors: the purpose and character of the use; the nature of the copyrighted work; the amount or substantiality of the portion used; and the effect of the use on the potential market for or value of the work.

For example, if a self-driving car system is trained on a photographer's photos to recognize stop signs, the photographer will not care too much; but if a model is trained on an artist's artwork to create other artwork in the same style, that is another story. Large models are encroaching on artists' fields by means of the artists' own works, and the copyright issue here is worth considering.

The second claim is that training was done under a copyright exception. Here the commercial/non-commercial distinction is very slippery, most exceptions do not include a communication right, and most have guardrails that might rule out AI training. What's more, some of the copied works came from sites known to be infringing: for example, the books collection in The Pile contains 196,640 e-books, most of which are pirated.

Secondly, on privacy: how can PII (personally identifiable information) be removed?

Models have large amounts of PII inside them, from the sources they were trained on. GDPR and other privacy laws require that individuals can have their PII removed on request.

As Peter asked: “How would you do that with a large language model? At scale?”

OpenAI responded by saying, “While some of our training data includes personal information that is available on the public internet, we want our models to learn about the world, not private individuals. So we work to remove personal information from the training dataset where feasible, fine-tune models to reject requests for personal information of private individuals, and respond to requests from individuals to delete their personal information from our systems.”

Third, who is responsible for the crazy things the models say?

What do you do when ChatGPT starts saying that you are a sexual predator?

In fact, this happened to George Washington University law professor Jonathan Turley, with the model citing made-up newspaper articles as evidence. When a complaint was raised with Microsoft, the company said it was taking steps to ensure search results are safe and accurate.

Models are not truth-tellers. They are trained to be plausible, not correct. The technical term for this is “bullshitter”.

Finally, Peter proposed that we should be active, if not activist: resist the idea that it is all up to Google, OpenAI, Microsoft, Facebook, et al. We have a stake in the copyright, privacy and liability systems that have evolved over centuries, not to mention in questions of bias, environmental impact, economic concentration, and so on. We have no choice but to learn more about this technology: students will certainly be using it, and we need to know enough about how it works to guide them.

Library Administrator of SMU Aaron Tay: How to use generative AI search tools more effectively?

Aaron Tay pointed out three things that Transformer-based large language models such as GPT-4, BERT, PaLM 2, and LLaMA bring to search: they improve relevance, they generate direct answers, and they extract information from papers (abstracts and full text) to enhance search engine result pages.

When asked 'Who won the FIFA World Cup in 2022?', ChatGPT stated that it could not answer because its training data ends in September 2021. ChatGPT alone therefore cannot cut it: its training data is not current, and it offers no way to verify its answers.

One solution is retrieval-augmented generation, which draws on external data sources. In more detail, this means giving the model a prompt such as: Given the text below, answer the question 'Who won the FIFA World Cup in 2022?', followed by the relevant retrieved text.
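
As a rough illustration of this prompt pattern, the following Python sketch picks the most relevant passage from a small corpus and places it under the question. The corpus, the keyword-overlap retriever, and the function names (`tokenize`, `retrieve`, `build_prompt`) are assumptions made for this example and were not part of the talk; a real system would more likely use a search index or an embedding-based retriever.

```python
import re

# Hypothetical mini-corpus standing in for an external data source.
DOCUMENTS = [
    "Argentina won the 2022 FIFA World Cup, beating France on penalties in the final.",
    "The 2018 FIFA World Cup was held in Russia and won by France.",
    "Scopus and Web of Science are citation databases used in literature searches.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Toy retriever: rank documents by how many tokens they share with the question."""
    question_tokens = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Place the retrieved passages under the question, following the pattern in the text."""
    context = "\n".join(passages)
    return f"Given the text below, answer the question '{question}'\n\n{context}"


question = "Who won the FIFA World Cup in 2022?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)
```

The resulting prompt would then be sent to the language model, which can answer from the supplied text rather than from its training data alone, and the retrieved passage gives the user something to check the answer against.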

In fact, there are more solutions. AI like ChatGPT is being built into the major search engines, and the Scopus, Dimensions, and Web of Science databases are introducing conversational AI search.

These search tools have two main limitations: firstly, they are very new, with unclear accuracy and reliability, and secondly, the current tools have limited data (most of these tools currently rely heavily on open scholarly metadata and open access papers).

Aaron pointed out that, in user education, users should be made aware of the weaknesses of generative AI, and it is worth considering whether these tools could make fake news more efficient and effective, as well as the long-term effects of using such technology. For reference work, the questions are whether these technologies will further reduce the need for reference desks, and whether capable, easy-to-maintain chatbots as first-tier support are now in sight.

In response to these issues, Aaron proposed several directions for improvement:

Sharing and education: everyone can share their latest discoveries. Aaron Tay himself has published an article at SMU introducing the latest tools, such as Bing.

A module on the use of AI for SMU students, which aims to clarify expectations on how students should use AI responsibly at SMU, enable students to use AI appropriately for academic success, and encourage students to be more thoughtful about using AI tools for learning and growth.

SMU is about to hold a hackathon on utilizing GPT to develop creative applications and enhance users' library and research experiences.

Honorary Professor of NUS Lai Choy Heng: Who’s afraid of ChatGPT? Or should we?

Lai Choy Heng opened with a question: Who’s afraid of ChatGPT? Or should we?

More specifically, what are the concerns of the education community? In fact, both the public and the education community are concerned about the emergence of generative AI.

"We built it, we trained it, but we don’t know what it’s doing." (Sam Bowman, AI researcher at NYU)

Every new technology carries inherent uncertainty about what it will do for people, and ChatGPT is no exception. Lai mainly discussed the following core issues:

What are the concerns/threats?
Use of technology in education
Opportunities - adopting and adapting Generative AI
Innovation and educational reform

A primary concern is academic integrity. Nowadays, perfect grammar and polished composition raise suspicion; after all, the potential uses of ChatGPT and generative AI cover an enormous range:

Automated generation of writing, background research, detailed feedback on human-written text, writing computer code, solving problems, answering conceptual questions.

Of course, ChatGPT and generative AI are not perfect. Firstly, ChatGPT may give incorrect or absurd answers. Rumors widely spread on social media may be treated as facts by ChatGPT. At present, ChatGPT still finds it difficult to fully understand and analyze the logical relationships inherent in information, and such factual errors can mislead students. We must therefore exercise judgment about all the information we obtain.

Another challenge comes from bias and prejudice, which are not yet easy to eliminate among humans, let alone machines. Integrity and ethical issues are widely debated in the AI community, followed by issues of equity and access: the technology may not be accessible to vulnerable regions or countries around the world. This will be a very tricky issue.

Secondly, regarding the application of technology in education, Lai compared Google with ChatGPT.

ChatGPT users spend less time on tasks, but their overall task performance is not significantly different.

ChatGPT users across different education levels excel in answering straightforward questions and providing general solutions but fall short in fact-checking tasks.

ChatGPT’s responses are perceived as having higher information quality.

ChatGPT users report better user experiences – useful, enjoyable, satisfying.

These results are not surprising. The key differences between a search engine and ChatGPT are:

Search engines typically do not retain an evolving history of an answer but return a list of discrete links to resources based on the ranking of specific search terms.

ChatGPT supports follow-up questions that develop and expand answers, and it responds to challenges posed by the questioner. The results obtained from search engines, by contrast, require full human scrutiny and filtering by the user.

How do we deal with ChatGPT, or more generally, AI, in education?

He pointed out several broad uses of ChatGPT based on a recent survey:

Dreaming - helping you think; Drudgery - lightening your load; Design - building your content; Development - advancing your work.

He noted that this shows ChatGPT can, in principle, make our lives easier, but it does not necessarily make us better teachers and educators.

There has been a flurry of investigations and surveys on the pros and cons of ChatGPT in education. For example:

It performs well on structured tasks but "struggles with nuanced tasks". Once a question is less straightforward, ChatGPT may run into trouble. Moreover, its answers are sometimes incorrect.

But its contribution to educational reform lies in its ability to stimulate follow-up questions through continuous dialogue, which is the core of interactive learning and the essence of Socratic dialogue. This kind of dialogue may be a very constructive and beneficial form of education, even though the answers are not always correct.

Lai borrowed from chaos theory, explaining that the "edge of chaos" is a transition zone of bounded instability that engenders a constant dynamic interplay between order and disorder.

Lai said that he is an optimist: as long as we face ChatGPT squarely, it can contribute to the transformation of education. We should therefore let new technologies serve us, rather than let them determine how we carry out our education work.

After the talks, a roundtable on ChatGPT and education brought the atmosphere in the venue to a climax. With Rick Lee (General Manager of World Scientific) moderating, Peter Schoppert, Aaron Tay, and Lai Choy Heng had a heated discussion on topics such as the application of ChatGPT in education.
