Stanford AI Researchers Propose ‘FOCUS’: A Foundation Model Framework That Aims to Ensure Perfect Secrecy for Personal Tasks

Machine learning offers the possibility of helping people with their personal activities. Personal tasks range from familiar ones, such as categorizing topics in personal correspondence or answering open-ended questions about personal relationships, to tasks specialized to individual users. Given the sensitive nature of the personal data these jobs require, such systems must ensure that no private information leaks while still producing high-quality, useful results.

The ideal privacy system would provide perfect secrecy: the likelihood of an adversary learning private information does not increase as users interact with the system. Simply training or fine-tuning a model on a user’s private dataset alone is an easy way to satisfy this strict privacy guarantee. However, recent neural models require large amounts of training data, while individual users typically have only a small amount of labeled data.

Federated learning (FL) across data spanning many users has emerged as an important method for overcoming the problem of individual users lacking sufficient data. Instead of requiring all users to send their data to a central location, FL trains a task model by shipping the model back and forth between users and a central server.
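The FL loop described above can be sketched as a single federated-averaging round: the server ships the current model to each user, each user trains locally on data that never leaves their device, and the server averages the returned updates. This is a minimal illustrative sketch; the model is just a list of floats and the "training" is a toy gradient step, not the paper's actual setup.

```python
# Hypothetical one-round FedAvg sketch; all names and the toy update
# rule are illustrative, not taken from the paper.

def local_update(weights, data, lr=0.1):
    """Each user refines the shared model on their own on-device data."""
    mean = sum(data) / len(data)
    # Toy gradient step: nudge each weight toward the user's data mean.
    return [w - lr * (w - mean) for w in weights]

def fedavg_round(global_weights, user_datasets):
    """Server ships the model out, users train locally, server averages."""
    updates = [local_update(list(global_weights), d) for d in user_datasets]
    n = len(updates)
    return [sum(ws) / n for ws in zip(*updates)]

weights = [0.0, 0.0]
users = [[1.0, 3.0], [5.0, 7.0]]  # raw data never leaves each user
weights = fedavg_round(weights, users)
print(weights)  # -> [0.4, 0.4]
```

Note that only model weights cross the network; as the next paragraph explains, even that is enough to leak private information.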

FL never transmits raw data between devices, but it forgoes perfect secrecy: the shared model itself can be exploited to recover confidential information. FL improves model performance for the average user, but private data varies greatly from person to person, so performance across participants is often uneven. The training process can also be compromised by adversarial participants or a malicious central server. Moreover, FL requires multiple communication rounds among many users to function properly, which introduces classic distributed-systems issues such as device heterogeneity and synchronization. These costs are incurred anew for each personal task a user wants to accomplish.

In response to these issues, researchers at Stanford University have recently proposed Foundation model Controls for User Secrecy (FOCUS), a framework for securely serving personal tasks based on a one-way data-flow architecture. FOCUS ships off-the-shelf public foundation models (FMs) to the silos of private users and applies zero-to-few-shot FM adaptation techniques, so personal tasks can be performed with the zero to a few training examples users actually have.
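The one-way data flow can be sketched as follows: a public FM is downloaded once into the user's silo, the user's few labeled examples are placed in a prompt, and inference runs entirely locally, so nothing flows back to the server. The "FM" below is a toy word-overlap scorer standing in for a real model; all names and the scoring rule are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of FOCUS-style one-way data flow with few-shot,
# in-context adaptation. The public_fm stub is a hypothetical stand-in
# for an off-the-shelf foundation model downloaded into the user silo.

def public_fm(prompt):
    """Toy scorer: returns the label of the in-context example that
    shares the most words with the query."""
    examples, query = prompt["examples"], prompt["query"]
    overlap = lambda a, b: len(set(a.split()) & set(b.split()))
    return max(examples, key=lambda ex: overlap(ex["text"], query))["label"]

def focus_serve(private_examples, private_query):
    """Runs entirely inside the user's silo; no data is sent back out."""
    prompt = {"examples": private_examples, "query": private_query}
    return public_fm(prompt)

examples = [{"text": "schedule dentist appointment", "label": "health"},
            {"text": "pay electricity bill", "label": "finance"}]
print(focus_serve(examples, "book dentist visit"))  # -> health
```

Because adaptation happens via the prompt rather than via gradient updates shipped to a server, no artifact derived from private data ever leaves the silo.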


The researchers used the Bell-LaPadula (BLP) model, which guarantees perfect secrecy, to formalize the confidentiality guarantee. The BLP model was originally created for government organizations to manage access control across multiple security levels, which maps naturally onto the configuration of publicly accessible FMs and privately held personal data. On 6 of 7 relevant benchmarks from the privacy literature, covering vision and natural language, the team found few-shot FM adaptation baselines to be competitive with strong FL baselines.
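The BLP rules, "no read up" and "no write down", can be sketched with two security levels matching the FOCUS setting: a public level (the FM) and a private level (the user silo). This is a simplified illustration of the classic model, not code from the paper.

```python
# Toy Bell-LaPadula check with two levels: public (0) and private (1).
# "No read up" lets the private silo pull in the public FM;
# "no write down" forbids private data from flowing back out.

LEVELS = {"public": 0, "private": 1}

def can_read(subject, obj):
    """Simple security property: a subject may not read up."""
    return LEVELS[subject] >= LEVELS[obj]

def can_write(subject, obj):
    """*-property: a subject may not write down."""
    return LEVELS[subject] <= LEVELS[obj]

# The private user silo may read the public FM...
assert can_read("private", "public")
# ...but may never write private information down to the public side.
assert not can_write("private", "public")
```

Under these two rules, the only permitted flow between levels is the one FOCUS uses: public model in, nothing private out.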


FOCUS suggests that perfect secrecy for a variety of personal tasks may be attainable, despite the field’s current emphasis on statistical notions of privacy. The work is only a proof of concept, and several issues remain for future work, including fragility, out-of-domain degradation, and slow inference with large models. FMs also have inherent drawbacks: they tend to hallucinate knowledge when uncertain, are available mainly in resource-rich languages, and are expensive to pretrain.

This article is a summary written by Marktechpost Staff based on the paper 'Can Foundation Models Help Us Achieve Perfect Secrecy?'. All credit for this research goes to the researchers on this project. Check out the paper and GitHub.


James G. Williams