Old 04-16-24, 04:33 PM
mev
bicycle tourist
Originally Posted by ericoseveins
Amazing, just a jumble of loosely associated words bordering on total gibberish. As a non-data scientist it's hard for me to conceive of any original source in which those words could've had actual relevant meaning. I'll have to ask my data scientist wife how to understand this. My unlearned hypothesis is that very little of the source data entered into the language model was actually contextually relevant to the question. Is that maybe the right interpretation?
Here is my description of the results.
1. This is an example of a text-generation (chatbot) task: provide a small number of tokens as input and have the model repeatedly guess a probability distribution over what comes next, using those guesses to generate text. A different task would be summarization, where a large number of tokens goes in and a short summary comes out. The model was trained on a wide variety of inputs, essentially encapsulating that knowledge as built in. This isn't the largest model (only 7B parameters) and not the highest-rated, so there could be improvements, though the results likely also show real limitations.
2. Text generation is essentially guessing what comes next. Arguably, this example does better at "structure" than "content": I asked it for mileages, stopping places and things to see, and the output contains those elements.
3. This model has essentially no sense of geography or spatial awareness, so I expect the actual content to be garbled: distances incorrect and place locations mismatched. Only if some of the inputs it remembered happened to mention places together will it put them together in the output. This also means it will do much worse with small details than with generic, larger places.
4. There are probably some areas where this can improve over time.
- One example is a mixture-of-experts (MoE) approach. Rather than making models larger and larger, recognize that different sub-models can be set up and trained for different tasks. Essentially, one switchboard decides the type of task and then dispatches to a sub-model.
- Another example is retrieval-augmented generation (RAG). For some tasks, like riding the C&O, there is a larger body of existing information, e.g. trip reports or advice columns. Essentially, the question model is set up with this specialized input so it can handle queries more specific to that data.
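To make the "probability guesses" in point 1 concrete, here is a toy sketch in Python: a tiny bigram table stands in for the trained model, and the generation loop just samples the next token from that table. The vocabulary and counts are made up for illustration; a real 7B model learns probabilities over huge contexts, but the generation loop is the same idea.

```python
import random

# Toy bigram "language model": for each token, the observed counts of
# what token followed it in some (invented) training text.
BIGRAM_COUNTS = {
    "ride": {"the": 3, "to": 1},
    "the": {"C&O": 2, "towpath": 2},
    "C&O": {"towpath": 4},
    "towpath": {"to": 2, "today": 1},
    "to": {"Cumberland": 3, "ride": 1},
}

def next_token(token, rng):
    """Sample the next token in proportion to how often it followed `token`."""
    counts = BIGRAM_COUNTS.get(token)
    if not counts:
        return None  # dead end: nothing ever followed this token
    tokens = list(counts)
    weights = [counts[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

def generate(prompt_token, max_tokens=6, seed=0):
    """Start from a prompt token and keep sampling 'what comes next'."""
    rng = random.Random(seed)
    out = [prompt_token]
    for _ in range(max_tokens):
        nxt = next_token(out[-1], rng)
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)
```

Note the model never "knows" where Cumberland is; it only knows which words tended to follow which, which is exactly why content comes out garbled while structure survives.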
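The retrieval-augmented idea in the last bullet can also be sketched in a few lines: rank a small library of trip-report snippets against the question by word overlap, then prepend the best matches to the prompt before generation. The snippets and the overlap scoring here are made-up stand-ins, not real C&O data or a real retriever (which would use embeddings rather than word counts).

```python
# Toy "library" of trip-report snippets (invented for illustration).
TRIP_REPORTS = [
    "The C&O towpath runs 184.5 miles from Georgetown to Cumberland.",
    "Paw Paw Tunnel is a highlight near milepost 155; bring a light.",
    "Free hiker-biker campsites appear every few miles along the canal.",
    "The Katy Trail in Missouri is a crushed-limestone rail trail.",
]

def retrieve(question, reports, k=2):
    """Rank reports by how many question words they share (crude relevance)."""
    q_words = set(question.lower().split())
    scored = sorted(
        reports,
        key=lambda r: len(q_words & set(r.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(question):
    """Prepend the best-matching snippets so the model answers from them."""
    context = "\n".join(retrieve(question, TRIP_REPORTS))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The point is that the model no longer has to "remember" the C&O; the relevant facts arrive in the prompt, and the model's job shrinks to restating them.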

The last comment is my own general perception on some of this:
A. There are problem areas that are really hard, and arguably today's large language models are no closer to solving them than we were 60 years ago. The example that comes to mind is from when I took an AI course in college years ago. We studied things that came many years before us, including some of the first chatbots (ELIZA in the 1960s). Those first chatbots essentially worked by taking some of the input words you provided and reformatting them back as part of the response. Some people got excited, since you could sort of have a conversation. At that level, how long before we reached general intelligence...
B. Needless to say, that was a much more difficult task than was believed in the 1960s. AI has had a pattern of over-enthusiasm followed by disillusionment and an "AI winter" when research largely stalled. I think we are perhaps at a similar point of over-enthusiasm on some aspects, and I don't think we are any closer to general intelligence than we were when I was in college.
C. I do think there are two aspects that have significantly improved as part of this latest boom:
= We have gotten much better at finding and exploiting patterns. Some tasks work well with patterns (e.g. spam detection, some summarization) and some don't (e.g. arithmetic, maps/locations). Some of those hard tasks are generically hard, and I don't expect us to solve them anytime soon. The pattern-based tasks can still be useful in their own right. As an example of the difference: I wouldn't ever ask AI to write a letter of recommendation; however, if I needed to write one, I might ask for an example to use as input on the structure of what one typically looks like. I would use that structure with my own content to create the actual letter and not use any of the words.
= We have also gotten much better at scaling to pull in more and more data. For example, the model I used is essentially an encoding of much of the internet, captured in a way that fits on a thumb drive.
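As a toy sketch of why spam detection is a "pattern" task: a crude filter can just score a message by what fraction of its words come from a known spammy list. Real filters learn those words and weights from data (e.g. naive Bayes), but the shape of the task is the same; the word list and threshold below are invented for illustration.

```python
# Invented list of "spammy" words; a real filter would learn these
# (and their weights) from labeled training data.
SPAM_WORDS = {"free", "winner", "prize", "click", "urgent"}

def spam_score(message):
    """Fraction of words in the message that match the spammy-word list."""
    words = message.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?") in SPAM_WORDS)
    return hits / max(len(words), 1)

def is_spam(message, threshold=0.2):
    """Flag the message when enough of it matches the spam pattern."""
    return spam_score(message) >= threshold
```

Contrast this with asking the same machinery for distances between towns: there is no surface pattern in the words that encodes geography, which is why that kind of task comes out garbled.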

So I am both excited about some possibilities and deeply skeptical/cynical about some of the results. I've watched some things progress quite far in the last ten years, while simultaneously seeing limitations that we had when I took my AI class 40 years ago and that won't be solved in the next 40 years.