| ▲ | DGoettlich 3 hours ago | |
fully understand you. we'd like to provide access but also guard against people misrepresenting our project's goals by pointing to e.g. racist generations. if you have thoughts on how we should do that, perhaps you could reach out at history-llms@econ.uzh.ch ? thanks in advance! | ||
| ▲ | myrmidon 3 hours ago | parent | next [-] | |
What is your worst-case scenario here? Something like a pop-sci article along the lines of "Mad scientists create racist, imperialistic AI"? I honestly don't see publication of the weights as a relevant risk factor, because sensationalist misrepresentation is trivially possible with the given example responses alone. I don't think such pseudo-malicious misrepresentation of scientific research can be reliably prevented anyway, and the disclaimers make your stance very clear. On the other hand, publishing weights might lead to interesting insights from others tinkering with the models. A good example of this would be the published word prevalence data (M. Brysbaert et al., Ghent University) that led to interesting follow-ups like this: https://observablehq.com/@yurivish/words I hope you can get the models out in some form; it would be a waste not to. Congratulations on a fascinating project regardless! | ||
| ▲ | superxpro12 3 hours ago | parent | prev | next [-] | |
Perhaps you could detect these... "dated"... conclusions and prepend a warning to the responses? IDK. I think the uncensored response is still valuable, with context. "Those who cannot remember the past are condemned to repeat it" sort of thing. | ||
| ▲ | bondarchuk an hour ago | parent | prev [-] | |
You can guard against misrepresentations of your goals by stating your goals clearly, which you already do. Any further misrepresentation is going to be either malicious or idiotic; a university should simply be able to deal with that. Edit: just thought of a practical step you can take: host it somewhere other than GitHub. If there's ever going to be a backlash, the Microsoft moderators might not take too kindly to the stuff about e.g. homosexuality, no matter how academic. | ||