
Security and Ethical Considerations for Non-U.S. Generative AI Tools

Introduction

Generative AI models offer opportunities as well as risks that must be evaluated, particularly with regard to data privacy, regulatory compliance, and censorship or bias.

Researchers, students, and higher education professionals interested in exploring AI potential at MSU should review the Interim Guidance on Generative Artificial Intelligence (AI) for Research and Creative Activities provided by the Office of Research and Innovation and its partners to understand best practices and considerations for integrating AI tools into their work.

The Evolving Landscape of AI Adoption

As AI adoption grows, assessing the potential implications of different models and services is critical.

Working with generative AI models developed outside the U.S., particularly those developed or operated in federally designated "countries of concern" with differing privacy laws or even political motives, demands an elevated level of data security and content vigilance.

On January 20, 2025, DeepSeek, a startup based in China, released an open-source AI model called DeepSeek-R1, which gained significant public attention due to its advanced reasoning capabilities and efficiency.

This release prompted the need for additional guidance to safeguard research integrity and ensure compliance. While this guidance focuses on DeepSeek, the same considerations apply to other generative AI models and services, especially those developed outside the U.S.

DeepSeek Risks

Since its launch, the DeepSeek model has attracted considerable interest in the generative AI space, establishing itself as a strong competitor to leading models like OpenAI's o1. Notably, DeepSeek R1 is fully open source under the MIT license, enabling users to run versions of the model locally and offline.

Model vs. Service

Understanding the distinction between the model and the service is an important baseline for evaluating risk.

  • The Model: DeepSeek R1 is available as an open-source model under the MIT license, allowing it to be downloaded, modified, and run locally on MSU’s own infrastructure. Operating the model in this manner enables the data to be processed internally, thereby mitigating risks associated with data transmission to external servers.
  • The Service: DeepSeek also offers an online service where users can interact with the R1 model via their website and mobile applications. Utilizing this service involves sending data to DeepSeek's servers, which are located in China, potentially exposing our data to external access and subjecting it to Chinese data regulations.
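As a concrete illustration of the local-deployment option described above, a distilled variant of DeepSeek-R1 can be downloaded and run entirely offline with a local runner such as Ollama. The model tag and sizing below are illustrative assumptions, not an MSU-endorsed configuration:

```shell
# Sketch: run a distilled DeepSeek-R1 variant on local hardware using
# Ollama (https://ollama.com). The 7B tag is an example; larger variants
# require proportionally more GPU memory and disk space.

# Download the model weights once; inference then runs locally,
# with no data sent to DeepSeek's servers.
ollama pull deepseek-r1:7b

# Start an interactive session against the local model.
ollama run deepseek-r1:7b
```

Ollama also exposes a local HTTP API (by default on localhost:11434), which allows applications to query the model without any external network traffic.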

Data Privacy Considerations

Using the online service version of DeepSeek R1 poses significant data privacy risks. User data, including research information, could be transmitted to and stored on servers located in China. This raises concerns about compliance with data protection standards and potential government access. Experts have cautioned against using such platforms for sensitive matters due to these potential risks.

MSU Information Security does not recommend the use of DeepSeek's online service through its website or its mobile applications.

Censorship and Bias Concerns

It is important to note that DeepSeek R1 has been observed to exhibit bias and censorship behaviors, particularly on topics sensitive to the Chinese government. For instance, the model may avoid discussions on subjects like the Tiananmen Square protests and Taiwan's political status. This built-in censorship could limit the model's applicability in contexts that require open and unbiased information dissemination.  

Recommendations

  • For Data Privacy: To mitigate data privacy risks, it is advisable to utilize the open-source model of DeepSeek R1 by running it locally on MSU-owned infrastructure. This approach ensures that all data processing remains internal, significantly reducing the risk of unauthorized external access.
  • For Censorship: Be aware that even when running the model locally, certain censorship mechanisms may still be present within the model's programming. It is recommended to thoroughly evaluate the model's responses to ensure they meet our standards for open and unbiased information, and to consider alternative models if necessary.
    • In response to the censorship concerns of DeepSeek R1, Perplexity AI has released R1 1776, a post-trained version of the DeepSeek-R1 model designed to provide unbiased, accurate, and factual information. This model has been refined to remove censorship mechanisms while maintaining its reasoning capabilities. Researchers interested in using DeepSeek R1 should be aware of this alternative and consider evaluating R1 1776 for its ability to address previously identified limitations. The model weights are available via Hugging Face.
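For teams evaluating the R1 1776 alternative mentioned above, the weights can be fetched from Hugging Face for local use. The repository identifier below is the one published by Perplexity; note that the full model is very large, so this is a sketch of the download step rather than a turnkey setup:

```shell
# Sketch: download the R1 1776 weights from Hugging Face for local
# evaluation. Plan storage accordingly; the full model is hundreds of
# gigabytes.
pip install -U "huggingface_hub[cli]"
huggingface-cli download perplexity-ai/r1-1776 --local-dir ./r1-1776
```

As with DeepSeek R1 itself, any local deployment should still be evaluated against MSU's data security and research compliance requirements before use with sensitive data.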

Ongoing Monitoring

The potential risks associated with DeepSeek R1 are being actively monitored, particularly in light of ongoing legislative actions in the United States. Notably, H.R. 1121, introduced in the U.S. House of Representatives, seeks to ban the use of AI models with ties to foreign adversaries, including DeepSeek. Additionally, New York State has taken steps to prohibit the use of such technologies in public institutions.

MSU will continue to assess developments in this space and provide further guidance as necessary.