Here’s how algorithms are made

Author Florian Jaton sheds light on the human side of algorithms

Ground-truthing

When a group of computer scientists, researchers, or engineers gather to create an algorithm, they are initially driven by a set of elements, including desires, skills, means, and hopes. For example, a research group might want to contest or outperform the results of a previously published scientific paper. The members of the team have a set of mathematical and programming skills that they can count on to achieve this goal.

They may have access to computing resources, academic papers, and digital tools that can help them. And finally, they might hope to make a change in a field of science, such as helping improvemedical imagingor solve a problem that can later be productized, such as creating an algorithm that can detect defects in manufacturing plants.

But before they can develop an algorithm that can meet their goals, they must go through a process of problematization and ground-truthing. During this phase, the researchers must precisely define the problem they want to solve and determine the type of data they need to validate their algorithm. For example, an image classification algorithm determines whether an object is present in an image or not. An object detection algorithm, on the other hand, must also determine the coordinates of the object in the image. Other specifications might also apply, such as whether the image contains only one object or it may contain several objects? Do lighting conditions vary in the environments where the algorithm will be used? Do the objects appear against various backgrounds or do they appear against the same background all the time?

Once the problem is formulated, the researchers must establish the ground truth by collecting the right material that enables them to verify their algorithms and the models they will build in the future. For example, in the case of computer vision algorithms, the researchers will gather a dataset of images that fit the description of their problem and can be used to trainmachine learning models. These images then have to be labeled with the data needed to test the algorithms. For example, if they are creating anobject detectionalgorithm, they must annotate each image with the data for the bounding box of the object(s) it contains.

It is important to understand that the way we think about problems and ground truths will largely affect the algorithms we create and the effects they will have. For example, if an object detection algorithm is derived from ground truths in which objects are centered, it might work well on similar images but fail miserably on images that contain multiple scattered objects. Likewise, if a facial recognition algorithm is only validated against images of people from a specific ethnicity, then it will perform poorly on images of other groups. As Jaton points out inThe Constitution of Algorithms, “we get the algorithms of our ground truths.”

“To consider the question of ground truths head-on would make it possible to communicate the following message: algorithms, as technical devices, can only retrieve things that have already been defined,” Jaton said.

“As soon as an algorithm produces a result, the reflex should thus be to immediately ask the following question: from which ground-truth database does this algorithm derive? This would highlight the intrinsic limitations of algorithms—they are optimization devices—while not undermining their value and beauty—they are arguably the finest and most beautiful optimization devices we have designed to date.”

Examining and documenting the ground-truthing process is extremely important to the study of algorithms and their effects on society, especially as they take on sensitive tasks. There have been numerous instances where poor design has led to algorithms committing critical mistakes such as making wrongfullybiased decisions, creatingfilter bubbles, and promoting fake news. There’s growing interest in understanding and addressing the risks of algorithms. Having a thorough process for studying and documenting the ground-truthing process will be crucial to addressing these risks.

“As long as the practical work subtending the constitution of algorithms remains abstract and indefinite, modifying the ecology of this work will remain extremely difficult,” Jaton writes in The Constitution of Algorithms. “Changing the biases that root algorithms in order to make them promote different values may, in that sense, be achieved by making the work practices that underlie algorithms’ veracities more visible. If more studies could inquire into the ground-truthing practices algorithms derive from, then actual composition potentials may slowly be suggested.”

Programming

Eventually, an algorithm moves to the programming phase, where a set of modules and a list of instructions are created to solve the defined problem and verified against the ground truth. But while this phase is usually reduced to pure source code, Jaton shows in his book that there is a lot more to programming than putting together a list of computer instructions.

“When cognitivists inquire into what makes programs exist, they cannot go beyond the form ‘program’ that precisely needs to be accounted for. In a surprisingly vicious circle that has to do with the so-called computational metaphor of the mind, cognitivists end up proposing numerous (mental) programs to explain the development of (computer) programs,” Jaton writes inThe Constitution of Algorithms.

This problematic view on the practices of programming is deeply rooted in the history of computing, Jaton describes in his book. Scientists, researchers, and companies have tried to frame computers as input-output systems that have been created in the image of the human brain. In reality, however, it was the human mind that was later reimagined as an organic version of the computer.

These metaphors have reduced programming to providing the right set of instructions to a digital brain. They have also shaped how programmers are trained and evaluated, putting more emphasis on instruction writing and disregarding all the other practices that are valuable to developing software.

In his book, Jaton documents his own and his team’s experience in writing instructions, stumbling on bugs, writing intermediary code to zero in on the root of the problem, and consulting with other team members.

He emphasizes the importance of the intermediate steps taken to adjust and improve the program, the interactions between team members in refining the code, and many other steps and practices that are not reflected in the final source code. In reading his book, I got to reflect on my own process in writing code and implementing algorithms and the small details that are important to my work but I take for granted.

“I think that the most overlooked aspect of programming practices are all the small provisionary inscriptions that accompany the writing of programs (and that are then deleted and made invisible),” Jaton said. “I was really surprised to discover all these little writings and experiments that punctuate programming sequences, and integrating them into the analysis of coding practices seems to me to be absolutely crucial to better understand what happens during these engaging moments (because as Donald Knuth rightly says, software is hard).”

The micro-sociological analysis of programming practices is just beginning, Jaton says, and its propositions are therefore still mostly exploratory. Therefore, it is difficult to say where it might lead to. But he is convinced that a mastery of programming practices through in situ micro-sociological studies will enable greater flexibility in algorithmic design.

“A better understanding of whys and wherefores of programming practices would perhaps make it possible to extract what makes good programmers special, in order to then try to infuse this specialness into the algorithmic design communities,” Jaton said.

Formulation

Finally, when an algorithm is implemented and verified against the ground truth, it becomes formulated into a mathematical object that can be later used in other algorithms. An algorithm must stand the test of time, prove its value in applications, and its usefulness in other scientific and applied work. Once proven, these algorithms become abstracted, taken as proven claims that need no further investigation. They become the basis and components of other algorithms, and contribute to further work in science.

But an important point to underline here is that when the problem, ground-truth, and implementation are formulated into an abstract entity, all the small details, and facts that went into creating it become invisible and tend to be ignored.

“If STS has long shown that scientific objects need to be manufactured in laboratories, the heavy apparatus of these locations as well as the practical work needed to make them operative tend to vanish as soon as written claims about scientific objects become certified facts,” Jaton writes inThe Constitution of Algorithms.

“Once there are no more controversies or disagreements about a new scientific object, nature tends to be invoked as the realm that always already contained this constructed scientific object… when facts are certified and enrolled in further studies, the experiments, instruments, communities, and practices that allowed their progressive formation are generally put aside.”

But it is important to understand that algorithms, once formulated, become the foundation of other algorithms, where they will contribute to ground-truthing, programming, formulation, and other practices. Having deeper visibility into the different phases of the constitution of algorithms will better equip us to have constructive discussions about them and investigate their broader impact.

“This conception of algorithms as the joint product of ground-truthing, programming, and formulating activities—themselves often supported by other algorithms that may have undergone analog constituting processes—complicates the overall picture while making it more intelligible,” Jaton writes.

“Indeed, whenever controversies arise over the effect of an algorithm, disputants may now refer to this basic mapping and collectively consider questions such as: How was the algorithm’s ground truth produced? Which formulas operated the transformation of the input-data into output-targets? What programming efforts did all this necessitate? And, if deeper reflections are required, disputants may excavate another layer: Which algorithms contributed to these ground-truthing, programming, and formulating processes? And how were these second-order algorithms constituted in the first place?”

Zooming out and looking at this intricate web of algorithms and their social components that evolve together, we get a bigger picture of how we and algorithms are becoming inexorable parts of each other. This is an area where we still have a lot to learn and reconsider our own views, Jaton says.

“I am coming to think more and more that the fact that we are always building an algorithm on top of other algorithms allows us to remember that the questions of effects and uses remain central to the analysis of algorithms in society,” he told me.

“Indeed, when reading the book, one might sometimes think that the sociology/ethnography of the effects and uses of algorithms is quite distinct from the sociology/ethnography of their conception. But on closer reflection, this distinction is in fact counterproductive: those who design new algorithms are also part of society; they are actors who use and are influenced, to varying degrees, by other algorithms, which they sometimes use to build new ones. It is therefore impossible to completely put aside the question of the effects and uses of algorithms, and it is therefore important to also include the study of the social effects of algorithms within the study of their shaping.”

This article was originally published by Ben Dickson onTechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original articlehere.

Story byBen Dickson

Ben Dickson is the founder of TechTalks. He writes regularly about business, technology and politics. Follow him on Twitter and Facebook(show all)Ben Dickson is the founder ofTechTalks. He writes regularly about business, technology and politics. Follow him onTwitterandFacebook

Get the TNW newsletter

Get the most important tech news in your inbox each week.

Also tagged with

More TNW

About TNW

Ukraine’s new F-16 simulator spotlights a ‘paradigm shift’ led from Europe

This claims to be the ‘world’s most advanced pen’ that digitises handwriting

Discover TNW All Access

New quantum algorithm could simulate industry-changing materials

How this startup battles water waste with leak-detection algorithms