NLP for Standards

Here's a look at how collaborative computing using natural language processing (NLP) can contribute to standards work, and a call to action to join an open source project to make it happen.

Advances in AI

Figure: AI-generated image of AI and NLP

As our dependence on computers grows, we speculate that they could in time replace humans as the primary intelligence on our planet.

There have been some spectacular recent advances in artificial intelligence (AI). Last year, an AI-generated artwork won a prize in the Colorado State Fair's annual art competition, causing some consternation in the artist community. While discussing the ideas in this article, we asked Midjourney, the tool that produced the prizewinning picture, to knock up something about AI and NLP. After some tuning of the input parameters, we obtained the image shown here. It is certainly striking, and I would even say artistic, though it is also rather disconcerting.

There have also been notable developments in NLP, which is a major branch of AI. The ChatGPT program that was released last November has caused quite a stir, with people claiming that it can write essays and even poems. I asked it, "How can natural language processing help with standards development?" The answer is given at the end. You can judge for yourself whether it was worth me continuing to write, or whether ChatGPT did a better job.

Collaborative Intelligence

The Midjourney image suggests a future in which people are machine slaves with virtual reality implants and hands that have grown extra fingers to type on keyboards. Many people take a more optimistic view. Augmented intelligence, a pattern in which AI supports humans and makes them more effective, was an early 2022 buzzword. Then we had collaborative intelligence, in which AI becomes an integral part of human collaborative working.

Garry Kasparov knows a thing or two about human versus artificial intelligence. He is the grandmaster who famously lost at chess to IBM's Deep Blue back in the 90s. Since then, he has been exploring collaboration between people and AI. Writing with David De Cremer in the Harvard Business Review, he describes an experiment showing that a well-organised partnership between people and machines can be superior not only to the best people but also to the best machines. They conclude that AI should augment human intelligence, not replace it. This is a good insight. It shows how use of AI can and should develop.

Standards Work

I have been concerned with standards for a very long time. I have participated in many standards developments, managed some, and seen the power of standards in the IT industry, most notably with the UNIX® standard that gave us application portability, and the Internet standards that gave us universal connectivity. More recently, we are seeing a proliferation of API standards which, though more limited in scope, give us significant interoperability. Standards work depends heavily on collaboration, and I want to explore how it can be supported by AI.

A standard starts with agreement on ideas. The UNIX definition came after many years of discussion of operating systems and schedulers. The Internet was adopted after a long period of argument about networking. A cloud storage API relies on an understanding of cloud data stores and how they are accessed.

How NLP Can Help

Figure: Topics and ideas from sections and sentences

NLP and AI can help identify and crystallise ideas, so that standards workers can discuss them and reach agreement. There has never been so much information freely available as there is now. The Web has a huge amount of material. This is presented in pages that contain sections and subsections, which in turn contain sentences, often using different words to describe the same topics and ideas. NLP technology is now able to identify the common ideas, so that experts can consider them as a basis for standards.
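
To make the idea of finding common topics across sentences a little more concrete, here is a minimal sketch in Python. It is an illustration only, not the method used by any particular tool: it assumes that sentences about the same idea share some content words, and groups them by the cosine similarity of their word counts. The function names, the small stopword list, the sample sentences, and the 0.3 threshold are all hypothetical choices made for this example. Production NLP software would more likely use stemming, synonyms, or learned sentence embeddings to catch ideas expressed in entirely different words.

```python
import math
import re
from collections import Counter

# A tiny illustrative stopword list (an assumption; real tools use larger ones).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and",
             "for", "that", "by", "its", "it", "each", "be"}


def tokenize(sentence):
    """Lowercase a sentence and return its content words as a bag of counts."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two bags of words (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_sentences(sentences, threshold=0.3):
    """Greedily place each sentence in the first group whose first
    sentence is similar enough; otherwise start a new group."""
    groups = []  # each entry: (bag of the group's first sentence, members)
    for sentence in sentences:
        bag = tokenize(sentence)
        for seed_bag, members in groups:
            if cosine(seed_bag, bag) >= threshold:
                members.append(sentence)
                break
        else:
            groups.append((bag, [sentence]))
    return [members for _, members in groups]


# Hypothetical sample sentences, as might be taken from different Web pages.
SAMPLE = [
    "A data mesh gives each domain team ownership of its data.",
    "In a data mesh, data is owned by domain teams.",
    "Cloud storage APIs expose objects over HTTP.",
]

if __name__ == "__main__":
    for number, members in enumerate(group_sentences(SAMPLE), start=1):
        print(f"Topic {number}:")
        for sentence in members:
            print("  ", sentence)
```

With the sample sentences, the two data mesh sentences fall into one group and the cloud storage sentence into another. The greedy grouping is deliberately simple; it depends on the order of the sentences and on the threshold, which is one reason why expert review of the extracted ideas remains important.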

The Open Group is a consortium that was known many years ago as the owner of the UNIX standard, and has been known more recently as the home of standards for enterprise architecture. Its Data Integration Work Group is developing a Guide to the use of The Open Group Standards for Data Integration, and is exploring the use of NLP technology to facilitate research into underlying concepts, use cases, and even into The Open Group publications, which are now so extensive that no individual knows them all. 

Lacibus participates in this work, and has produced an initial implementation of an open source Ideas Browser that extracts topics and ideas from Web pages and enables users to browse them at the ideas level without having to read all the words. This is an ideal application for virtual data lakes. The figure shows a screenshot of the browser displaying one of the topics extracted from a list of pages related to Data Mesh.

Topic: Who has actually implemented a data mesh?

This is a very basic implementation. It uses some of the huge amount of open source NLP software that is now freely available. It can be improved massively with further development.

Call to Action

We are looking for volunteers to join the project and turn the initial implementation into a solid open source product. If you are interested in natural language processing, and can contribute to development of the NLP algorithms or the user interface, to testing, or to project management, we would love to hear from you. Please send me an e-mail.


What ChatGPT Said

Finally, here's ChatGPT's reply to the question, "How can natural language processing help with standards development?"

"Natural language processing (NLP) can be used to help with standards development in several ways. For example, NLP can be used to extract information from unstructured text documents, such as specifications or requirements documents, and convert it into a structured format that is easier to analyze and work with. This can be useful for identifying common themes or patterns in the text, as well as for identifying potential issues or inconsistencies."

"NLP can also be used to facilitate communication and collaboration among team members working on standards development. For example, NLP-powered tools can be used to automatically generate summaries of long documents or to highlight key points in a document, making it easier for team members to stay informed about the latest developments. Additionally, NLP can be used to facilitate online discussions and meetings, by automatically transcribing spoken comments or generating summaries of group discussions."

"Overall, NLP can be a powerful tool for improving the efficiency and effectiveness of the standards development process."