Pair Programming with ChatGPT
I was toying with a graph-theory problem when I came across some posts on Facebook about programs written by ChatGPT. The programs they showed, though trivial, demonstrated that ChatGPT sure could “think” deep. So I figured I should see how ChatGPT would help solve my problem.
I can say I am more than impressed. I posed a relatively non-trivial problem, and after some iterations ChatGPT came back with an efficient solution along with test cases.
In this blog I have captured the entire transcript of my conversation with ChatGPT so that others can benefit.
I opened up my conversation with a simple, “Can you help with a graph algorithm problem?”
To which ChatGPT responded very pleasantly with “I’ll do my best”.
I went ahead and described the problem as I would to a colleague, as follows:
“I want to compute a projection of a graph as follows. Given an original graph G with nodes V and edges E, create a projected graph P with nodes N, where N is a given subset of V, and edges E’ such that there is an edge (n1, n2) in E’ if there is a directed path from n1 to n2 in E.”
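To make the definition concrete, here is a minimal brute-force sketch of that first formulation in Python. This is my own code, not the pseudocode ChatGPT produced; the function name `project` and the edge-list representation are my choices.

```python
from collections import defaultdict

def project(edges, nodes):
    """Naive projection: edge (n1, n2) in P iff a directed path
    from n1 to n2 exists in the original graph."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)

    def reaches(src, dst):
        # iterative DFS; the `seen` set keeps cycles from looping forever
        seen, stack = {src}, [src]
        while stack:
            for w in adj[stack.pop()]:
                if w == dst:
                    return True
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return False

    return {(n1, n2) for n1 in nodes for n2 in nodes
            if n1 != n2 and reaches(n1, n2)}
```

Note that under this first definition any path counts, even one passing through other projected nodes, which is exactly the flaw I was about to discover.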
The first solution
I don’t need to speak for ChatGPT. You can look at the response yourself.
Let’s get to business
From the Facebook posts I was already impressed by ChatGPT. But I was not here to test drive it; I wanted to solve the problem. So I chugged on.
Reading the pseudocode I realized I had not formulated the problem correctly, and so went back with my correction. I said, “I need to refine my definition of the projected edge. There should be an edge (n1, n2) if there exists a path from n1 to n2 in E that does not go through another node nk in N.”
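The refinement changes only the path test: intermediate nodes from N are now off-limits. A hypothetical sketch of the corrected definition, checked pairwise (again my own code, not ChatGPT's):

```python
from collections import defaultdict

def project(edges, nodes):
    """Refined projection: edge (n1, n2) iff some path from n1 to n2
    has no *intermediate* node that also lies in `nodes`."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)

    def clean_path(src, dst):
        # DFS that refuses to pass *through* projected nodes;
        # dst itself is still a legal endpoint.
        seen, stack = {src}, [src]
        while stack:
            for w in adj[stack.pop()]:
                if w == dst:
                    return True
                if w not in seen and w not in nodes:
                    seen.add(w)
                    stack.append(w)
        return False

    return {(n1, n2) for n1 in nodes for n2 in nodes
            if n1 != n2 and clean_path(n1, n2)}
```

With all three of 1 → 2 → 3 projected, the shortcut edge (1, 3) is now correctly suppressed, because the only path runs through the projected node 2.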
Even though the error was mine, ChatGPT came back with an apology. That was a surprise, and it makes me wonder about the data used to train it. I am sure it can’t be Stack Overflow or Reddit.
C’mon, you can do better
The algorithm was pretty good for a machine to generate. But like I said, I was not here to test ChatGPT. I was here to solve a problem, and that it was being solved by a machine was incidental.
While the algorithm generated was good, it’s one I could have written myself. It followed directly from my definition. So I went on, “This is a good algorithm, and it follows directly from the definition. Can we make it more efficient? It doesn’t seem to make sense to start traversal from n1 for every n2. Can we traverse paths from each n1 and stop when they terminate at an n2 in N?”
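The idea I was pushing for can be sketched as follows: one traversal per projected node, recording an edge the moment the walk reaches another projected node and never traversing past it. This is my rendering of the idea, not ChatGPT's actual answer.

```python
from collections import defaultdict

def project(edges, nodes):
    """One DFS per projected node: walk out from n1 and record an edge
    as soon as the walk hits another projected node, without going past it."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)

    result = set()
    for n1 in nodes:
        seen, stack = {n1}, [n1]
        while stack:
            for w in adj[stack.pop()]:
                if w in seen:
                    continue
                seen.add(w)
                if w in nodes:
                    result.add((n1, w))   # stop here; do not traverse past w
                else:
                    stack.append(w)
    return result
```

Each of the |N| traversals touches every edge at most once, so the cost is roughly O(|N| · (|V| + |E|)), versus one DFS per *pair* in the pairwise version. The `seen` set also makes the traversal safe on cyclic graphs.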
To which ChatGPT agreed and came back with a new algorithm. The surprising and exciting part is that ChatGPT went past my definition and deeper into graph theory, bringing “depth-first search” and “strongly connected components” into its algorithm and reasoning. Check out ChatGPT’s response for yourself.
Getting deeper in the weeds
But I wasn’t too happy with the warning that the algorithm “assumes that the graph is strongly connected.” That is not a condition I could live with, so I wanted to confirm that I understood what ChatGPT meant.
Indeed, ChatGPT came back describing what an SCC was, implying it knew what it was talking about, and went on to tell me how to deal with the case where the graph is not strongly connected.
But I thought there was something not quite right about the reasoning. A “strongly connected” graph has cycles, yet the algorithm it presented worked for acyclic graphs. So I pressed on.
ChatGPT acknowledged that the algorithm would work for acyclic graphs.
It gets intense
But that didn’t address the original caution it had raised, so I asked again posing a particular scenario.
At this point ChatGPT converted its original suggestion about dealing with SCCs into an algorithm.
But it turned out to be a bad solution. It was treating each SCC in isolation from the rest of the graph. In an acyclic graph every node is its own SCC, and so this algorithm would not work for an acyclic graph. I brought that up with ChatGPT.
Indeed, ChatGPT accepted the issue and came back with another solution, this time using topological sort.
Time to backtrack
Now I figured this was not going in the right direction, and asked ChatGPT to backtrack.
It went back to a previous solution.
But now I am getting the impression that ChatGPT doesn’t want to challenge me. It always seems to agree with me and comes back with something, whatever that may be. It is left to me to determine whether the answer is right.
I had wanted ChatGPT to go back to the more efficient algorithm. Since we didn’t have any convention for naming algorithms, I asked again.
And it was back on track.
I needed to chew on it for some time.
ChatGPT gracefully let me, and added that the algorithm works for both cyclic and acyclic graphs.
How do we know this thing works?
At this stage my choice was to either formally check the algorithm’s correctness or test it. Both required work, and I wanted to shift the work to ChatGPT. How could I make ChatGPT run the programs? I didn’t think it could do anything other than chat. That’s when I got a brainwave: what if I asked ChatGPT to give me test cases for the problem?
Lo and behold, it produced test cases.
This was encouraging. Now all I needed to do was check the tests.
It turned out that the tests were rather trivial. In all of them the nodes being projected upon had direct edges between them in the original graph; they didn’t require traversing any paths. I brought this to ChatGPT’s attention.
To which it came back with better tests.
They were still not good enough. After all our discussion about SCCs, none of the tests had any cycles.
ChatGPT obliged with more tests.
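One way to make “good enough” checkable is to assert properties of the test fixtures themselves. Here is a sketch with a hypothetical fixture of my own devising (not one of ChatGPT's actual tests): a graph that contains a cycle and in which the projected nodes are linked only through intermediate nodes.

```python
def has_cycle(edges):
    """Detect a directed cycle with the classic white/grey/black DFS."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for e in edges for n in e}

    def visit(u):
        color[u] = GREY          # grey = on the current DFS path
        for w in adj.get(u, []):
            if color[w] == GREY:
                return True      # back edge: found a cycle
            if color[w] == WHITE and visit(w):
                return True
        color[u] = BLACK         # black = fully explored
        return False

    return any(color[n] == WHITE and visit(n) for n in list(color))

# A hypothetical non-trivial fixture: the projected nodes 1 and 4 are
# linked only through the intermediate node 2, and the graph contains
# the cycle 1 -> 2 -> 3 -> 1. The expected projection would be {(1, 4)}.
edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
subset = {1, 4}

assert has_cycle(edges)
assert not any((a, b) in edges for a in subset for b in subset)
```

The two assertions at the bottom are exactly the complaints I had raised: the fixture must contain a cycle, and no projected pair may be connected by a direct edge.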
Now, what about the case ChatGPT had been cautioning me about: graphs with multiple SCCs. Shouldn’t there be a test for it?
And there it came back with a test.
And an explanation of why the algorithm would fail on it.
Are we speaking the same language?
But this leads me to a new realization. It looks like ChatGPT may be conflating “strongly connected components” with a “strongly connected graph”.
ChatGPT acknowledges the error.
Well, but the reasoning is still not quite right.
ChatGPT corrects itself again, and this time explains itself better.
So now we are in sync: it is not the SCCs that are the issue here.
With ChatGPT being so polite and apologetic, I too respond with empathy and praise.
ChatGPT is happy to hear it, and cautions me again about those SCCs.
Let’s check another way
I am still not sure why the algorithm is supposed to fail on that example. I really need to run the program, but how?
Here I get another brainwave: ask ChatGPT to give an execution trace showing the error.
ChatGPT obliges again.
And explains the missed edge.
There, that was helpful. I see there may be a flaw in ChatGPT’s reasoning.
Alright, we got it
And now we are finally there. We have an algorithm that meets my requirements.
ChatGPT, for some reason, changed how it was presenting the algorithm, this time giving an itemized list.
I don’t think the previous algorithm needed correcting. It was only ChatGPT’s reasoning about it that was off. But I am not complaining. I need to move on.
How about its complexity?
It’s as fast as it can get.
Let’s turn it into code and unit tests
Now let’s get the algorithm into code that I can use. I am sure ChatGPT can do it. And while at it, may as well get some unit tests.
Now I am getting frustrated. After the long exchange about test cases, we are back to square one. Should I really have to spell this out?
But ChatGPT is polite, and I keep my cool too.
Now we have some better tests. I am impressed that the test methods are named appropriately too. Good work.
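For readers who want something concrete, here is roughly what such a suite could look like. This is my own reconstruction, not ChatGPT's verbatim output; the function body is a compact sketch of the traversal-based algorithm we converged on, and the test names follow the descriptive style I praised above.

```python
import unittest
from collections import defaultdict

def project(edges, nodes):
    """Project `edges` onto `nodes`: edge (n1, n2) iff a path from n1
    reaches n2 without passing through another projected node."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    result = set()
    for n1 in nodes:
        seen, stack = {n1}, [n1]
        while stack:
            for w in adj[stack.pop()]:
                if w in seen:
                    continue
                seen.add(w)
                if w in nodes:
                    result.add((n1, w))   # stop at a projected node
                else:
                    stack.append(w)
    return result

class ProjectionTests(unittest.TestCase):
    def test_direct_edge_is_preserved(self):
        self.assertEqual(project([(1, 2)], {1, 2}), {(1, 2)})

    def test_path_through_outside_node(self):
        self.assertEqual(project([(1, 9), (9, 2)], {1, 2}), {(1, 2)})

    def test_path_blocked_by_projected_node(self):
        # 1 -> 2 -> 3 with all three projected: no shortcut edge (1, 3)
        self.assertEqual(project([(1, 2), (2, 3)], {1, 2, 3}),
                         {(1, 2), (2, 3)})

    def test_cycle_does_not_loop_forever(self):
        edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
        self.assertEqual(project(edges, {1, 4}), {(1, 4)})
```

Saved as a module, this runs with `python -m unittest`.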
And with that I had the program I needed.
So what do I think of my experience?
In one word, incredible. And I’d further say, it was akin to “pair programming” — developing a program working with another programmer.
I came to ChatGPT to solve a problem, and it sure served as a great “partner” in solving it. I could express myself at the level I would with a colleague, and it followed me well. It has a very broad vocabulary when it comes to programming tasks: it can write algorithms, create test cases, generate execution traces, reason about correctness, and write syntactically correct programs.
On the caution side, just like human colleagues, ChatGPT is prone to misunderstanding what you ask, misunderstanding or misusing concepts, and erroneous reasoning. So you cannot automatically assume that the answers it produces are correct. Like colleagues, especially junior colleagues, it is too eager to please and may give incorrect answers if you push it in the wrong direction. It is on you to check the answers.
But the most surprising part for me was that ChatGPT, a machine, could make connections between my problem statement and known graph-theory concepts: DFS, strongly connected components, and topological sort. And it applied those concepts to solve the problem without my giving any explicit clue about them.