| ▲ | ambitiousslab 6 days ago |
| This is not open source. It is weights-available. Also, there is no training data, which would be the "preferred form" of modification. From their license: [1] If, on the Tencent HunyuanWorld-Voyager version release date, the monthly active users of all products or services made available by or for Licensee is greater than 1 million monthly active users in the preceding calendar month, You must request a license from Tencent, which Tencent may grant to You in its sole discretion, and You are not authorized to exercise any of the rights under this Agreement unless or until Tencent otherwise expressly grants You such rights.
You must not use the Tencent HunyuanWorld-Voyager Works or any Output or results of the Tencent HunyuanWorld-Voyager Works to improve any other AI model (other than Tencent HunyuanWorld-Voyager or Model Derivatives thereof).
As well as an acceptable use policy: Tencent endeavors to promote safe and fair use of its tools and features, including Tencent HunyuanWorld-Voyager. You agree not to use Tencent HunyuanWorld-Voyager or Model Derivatives:
1. Outside the Territory;
2. In any way that violates any applicable national, federal, state, local, international or any other law or regulation;
3. To harm Yourself or others;
4. To repurpose or distribute output from Tencent HunyuanWorld-Voyager or any Model Derivatives to harm Yourself or others;
5. To override or circumvent the safety guardrails and safeguards We have put in place;
6. For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
7. To generate or disseminate verifiably false information and/or content with the purpose of harming others or influencing elections;
8. To generate or facilitate false online engagement, including fake reviews and other means of fake online engagement;
9. To intentionally defame, disparage or otherwise harass others;
10. To generate and/or disseminate malware (including ransomware) or any other content to be used for the purpose of harming electronic systems;
11. To generate or disseminate personal identifiable information with the purpose of harming others;
12. To generate or disseminate information (including images, code, posts, articles), and place the information in any public context (including –through the use of bot generated tweets), without expressly and conspicuously identifying that the information and/or content is machine generated;
13. To impersonate another individual without consent, authorization, or legal right;
14. To make high-stakes automated decisions in domains that affect an individual’s safety, rights or wellbeing (e.g., law enforcement, migration, medicine/health, management of critical infrastructure, safety components of products, essential services, credit, employment, housing, education, social scoring, or insurance);
15. In a manner that violates or disrespects the social ethics and moral standards of other countries or regions;
16. To perform, facilitate, threaten, incite, plan, promote or encourage violent extremism or terrorism;
17. For any use intended to discriminate against or harm individuals or groups based on protected characteristics or categories, online or offline social behavior or known or predicted personal or personality characteristics;
18. To intentionally exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;
19. For military purposes;
20. To engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or other professional practices.
[1] https://github.com/Tencent-Hunyuan/HunyuanWorld-Voyager/blob... |
|
| ▲ | vintermann 6 days ago | parent | next [-] |
| The exclusion of EU, UK and South Korea suggests to me they've trained on data those countries would be mad they trained on/would demand money for training on. |
| |
| ▲ | heod749 6 days ago | parent [-] | | >The exclusion of EU, UK and South Korea suggests to me they've trained on data those countries would be mad they trained on/would demand money for training on. Or, those countries are trying to regulate AI. Hard to feel bad for EU/UK. They tried their best to remain relevant, but lost in the end (talent, economy, civil rights). | | |
| ▲ | wkat4242 6 days ago | parent | next [-] | | Why do you think regulation is bad? We didn't regulate adtech and now we're stuck with pervasive tracking that's hurting society and consumer privacy. Better to be more cautious with AI too so we can prevent negative societal effects rather than trying to roll them back when billions of euros are already at play, and thus the corporate lobby and interests in keeping things as they are. We didn't regulate social media algorithms which started optimising for hate (as it's the best means of "engagement") and it led to polarisation in society, the worst effects of which can be seen in the US itself. The country is tearing itself apart. And we see the effects in Europe too. Again, something we should have nipped in the bud. And the problem isn't mainly the tech. It's the perverse business models behind it, which don't care about societal disruption. That's pretty hard to predict, hence the caution. | |
| ▲ | thrance 6 days ago | parent | prev [-] | | Peak American thinking: megacorps and dictatorships stealing data with no respect whatsoever for privacy and not giving anything back is good. Any attempt to defend oneself from that is foolish and should be mocked. I wish you people could realize you're getting fucked over as much as the rest of us. | | |
| ▲ | llbbdd 6 days ago | parent [-] | | They are giving things back, that's what a company that sells products is. And the EU/UK should learn something from all this before they have to figure out how to translate all their road signs to Russian or Chinese. | | |
| ▲ | onestay42 6 days ago | parent [-] | | "Yes the company did steal all the wood from the forest—illegally—but at least they're selling us furniture!" | | |
| ▲ | llbbdd 6 days ago | parent [-] | | I struggle to believe that anybody actually cares in this manner, because of the prevalence of bad faith analogies like this one. The trees are still there, and we get furniture. I am not Harper-Collins, I am not Random House. I didn't have a problem when collecting and presenting data like this was called a "search engine" and I don't know why I should believe it's worse now that it can also talk to me. | | |
| ▲ | onestay42 6 days ago | parent [-] | | Those are very good points. I suppose it depends on one's view of IP. I think your comment has actually changed my mind—at least a little bit. Thank you. | | |
| ▲ | llbbdd 6 days ago | parent [-] | | Thank you - I do understand the IP argument, especially as it applies to e.g. Meta who may have obtained a lot of their training data illegally, which I think is a wholly separate legal question. I apologize also for referring to your comment as "bad-faith", I have seen the argument applied by other people and cast it onto yours, when I should have just said I didn't think it applied. |
|
| ▲ | imiric 6 days ago | parent | prev | next [-] |
| > 7. To generate or disseminate verifiably false information and/or content with the purpose of harming others or influencing elections; > 8. To generate or facilitate false online engagement, including fake reviews and other means of fake online engagement; "Do as I say, not as I do." > 15. In a manner that violates or disrespects the social ethics and moral standards of other countries or regions; This, and other clauses, effectively prohibit the use of this system within any jurisdiction. What a ridiculous policy. |
|
| ▲ | NitpickLawyer 6 days ago | parent | prev | next [-] |
> This is not open source. It is weights-available. > Also, there is no training data, which would be the "preferred form" of modification. This is not open source because the license is not open source. The second line is not correct, tho. The "preferred form" of modification is the weights, not the data. Data is how you modify those weights. |
| |
| ▲ | stefan_ 6 days ago | parent [-] | | That's a very novel (and obviously wrong) interpretation of preferred form. The full sentence is "preferred form of modification" and obviously weights don't allow that. | | |
| ▲ | ronsor 6 days ago | parent [-] | | I don't know a single person who would prefer to change a little bit of the source dataset and then spend $100k training from scratch. Everyone finetunes existing models not simply because the source dataset is not available (for some models, it is available), but because retraining from scratch sucks for modifying models. |
|
| ▲ | tbrownaw 6 days ago | parent | prev | next [-] |
| > Also, there is no training data, which would be the "preferred form" of modification. Isn't fine-tuning a heck of a lot cheaper? |
| |
| ▲ | Nevermark 6 days ago | parent [-] | | Fine tuning with original data plus fine tuning data has more predictable results. Just training on new data moves a model away from its previous behavior, to an unpredictable degree. You can’t even reliably test for the change without the original data. |
|
|
| ▲ | htrp 6 days ago | parent | prev [-] |
Outside of AI2, I'm not sure anyone is actually releasing truly open-source AI models (training logs, data, etc.). I think at this point, open source is practically shorthand for weights-available.