richardfeynman 18 hours ago

This is an interesting dataset to collect, and I wonder whether there will be applications for it beyond what you're currently thinking.

A couple of questions: What's the relationship between the number of hours of neurodata you collect and the quality of your predictions? Does it help to get less data from more people, or more data from fewer people?

n7ck 18 hours ago

1. The predictions get better with more data, and we don't seem to be anywhere near diminishing returns.
2. The thing we care about is generalization between people. For this, less data from more people is much better.

richardfeynman 18 hours ago

I noticed you tracked sessions per person, implying a subset of people have many hours of data collected on them. Are predictions for this subset better than the median?

For a given amount of data, is it better to have more people with less data per person or fewer people with more data per person?

clemvonstengel 17 hours ago

Yes, the predictions are much better for people with more hours of data in the training set. Usually, we fully separate the train and val sets, so no individual with any sessions in the train set is ever used for evals. When we instead evaluate on someone with 10+ hours in the train set, predictions get ~20-25% better.

For a given amount of data, whether you want more or less data per person really depends on what you're trying to do. The thing we want is for it to be good at zero-shot, that is, for it to decode well on people who have zero hours in the train set. So for that, we want less data per person. If instead we wanted to make it do as well as possible on one individual, then we'd want way more data from that one person. (So, e.g., when we first make it into a product, we'll probably finetune on each user for a while.)
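
To make the train/val separation above concrete, here is a minimal sketch of a split done per person rather than per session, so nobody in the val set has any sessions in the train set. This is an illustration only, not our actual pipeline: `split_by_subject`, the `(subject_id, session_id)` tuple layout, and the `val_fraction` and `seed` parameters are all assumed names.

```python
import random


def split_by_subject(sessions, val_fraction=0.2, seed=0):
    """Split sessions so that no subject appears in both train and val.

    `sessions` is a list of (subject_id, session_id) tuples (a hypothetical
    layout chosen for this example).
    """
    # Collect the unique subjects; sort first so the shuffle is reproducible.
    subjects = sorted({subject for subject, _ in sessions})
    rng = random.Random(seed)
    rng.shuffle(subjects)

    # Hold out a fraction of *people* (at least one), not a fraction of sessions.
    n_val = max(1, round(len(subjects) * val_fraction))
    val_subjects = set(subjects[:n_val])

    train = [s for s in sessions if s[0] not in val_subjects]
    val = [s for s in sessions if s[0] in val_subjects]
    return train, val


if __name__ == "__main__":
    # Toy data: 10 people with 3 sessions each.
    sessions = [(f"p{i}", f"s{j}") for i in range(10) for j in range(3)]
    train, val = split_by_subject(sessions, val_fraction=0.2, seed=1)
    print(len(train), "train sessions,", len(val), "val sessions")
```

Evaluating on `val` from a split like this measures the zero-shot case. Splitting by session instead would let the same person land on both sides, which is the setting where predictions look better.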

richardfeynman 17 hours ago

Makes a ton of sense, thanks.

I wonder if there will be medical applications for this tech, for example identifying people with brain or neurological disorders based on how different their "neural imaging" looks from normal.