Patent Issued for Utilizing a protected server environment to protect data used to train a machine learning system (USPTO 11449632): DeepIntent Inc. – InsuranceNewsNet

2022 OCT 07 (NewsRx) — By a News Reporter-Staff News Editor at Insurance Daily News — From Alexandria, Virginia, NewsRx journalists report that a patent by the inventors Dakic, Vaso (Irvine, CA, US), Gerritz, Kelly Harold Patrick (Astoria, NY, US), Paquette, Christopher Thomas (New York, NY, US), Perlman, Jennifer Werther (Hillsdale, NJ, US), Romanovski, Pavel (Wallington, NJ, US), Yazovskiy, Anton (Brooklyn, NY, US), filed on December 8, 2021, was published online on September 20, 2022.

The assignee of patent number 11449632 is DeepIntent Inc. (New York, New York, United States).

News editors obtained the following quote from the background information supplied by the inventors: “The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Further, it should not be assumed that any of the approaches described in this section are well-understood, routine, or conventional merely by virtue of their inclusion in this section.

“Machine learning systems have become popular for solving various types of problems based on training data. A key benefit of a machine learning system is the ability to learn based on data, bypassing any requirements for manual coding of an algorithm. Instead, the machine learning system generates an algorithm or model through repeated computations using the training data.

“A potential drawback of machine learning systems is that determining specific internal operating mechanisms of the core machine learning engine can be difficult. Most machine learning systems are configured to generate fairly complex patterns based on the given training data. Because machine learning systems use complex algorithms and execute continuous learning, determining why a machine learning system produced a particular result from a set of input data can be difficult, if not impossible. In some situations, this can lead to a lack of accountability; in other situations, this feature protects the training data. Because a trained machine learning system exists separately from the training data, any data that is protected or sensitive data can be safeguarded during the use of the machine learning system.

“A trained machine learning system inherently protects the data used to train it. However, the training phase can create issues, especially when the data used to train the machine learning system is robust but protected. Many people provide data under the assurance that data security measures will be used. As an example, the Health Insurance Portability and Accountability Act (HIPAA) has stringent requirements on the protection of medical claims data which would prevent a person from viewing any of the medical claims data to train a machine learning system.

“Additionally, even when information is protected from viewing, the training data or machine learning system can still provide protected information to a viewer. For instance, a machine learning system using ten inputs could memorize a vast majority of people in the United States, thereby providing one-to-one recognition of individuals instead of providing an algorithm that produces a likelihood based on general patterns. But to validate the training data or the machine learning system would generally involve accessing the training data or machine learning system, thereby failing to provide the originally desired protections.

“Thus, there is a need for a system that can protect personal, private, confidential, or otherwise protected information during training and validation of a machine learning system that utilizes the protected information.”
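The memorization concern raised in the background can be illustrated with a small sketch. It measures how many records in a dataset carry a combination of attribute values that no other record shares, which is the situation in which a model could recognize individuals rather than learn general patterns. The function name `unique_fraction` and the toy records are hypothetical and do not come from the patent.

```python
from collections import Counter
from typing import Hashable, Sequence


def unique_fraction(records: Sequence[Sequence[Hashable]]) -> float:
    """Return the fraction of records whose attribute combination is unique.

    A value near 1.0 means almost every record can be singled out by its
    attributes alone, so a model trained on them could memorize individuals.
    """
    if not records:
        return 0.0
    counts = Counter(tuple(record) for record in records)
    singletons = sum(1 for record in records if counts[tuple(record)] == 1)
    return singletons / len(records)


# Made-up toy data: (ZIP code, birth year, sex). Not from the patent.
records = [
    ("07073", 1980, "F"),
    ("07073", 1980, "F"),
    ("10001", 1975, "M"),
    ("11201", 1990, "F"),
]

# Two of the four records have a combination shared by no other record.
print(unique_fraction(records))  # 0.5
```

Adding more attributes to each record can only keep this fraction the same or raise it, which is why a model with many inputs may approach one-to-one recognition.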

As a supplement to the background information on this patent, NewsRx correspondents also obtained the inventors’ summary information for this patent: “The appended claims may serve as a summary of the disclosure.”

The claims supplied by the inventors are:

“1. A computer-implemented method comprising: storing, using a server computer executing within a protected environment, a plurality of media items, each of the media items corresponding to one of a plurality of different status values; receiving, from a requesting computing device that is outside the protected environment, a request to send certain media items outside the protected environment to a client computing device; computing, using a plurality of machine learning systems executed by the server computer, each of the machine learning systems having been trained with one of the plurality of status values as an output, a plurality of likelihood values associated with a particular status value for the client computing device, each of the machine learning systems having been trained at least in part by receiving, by the server computer executing within the protected environment, instructions to generate and train a particular machine learning system, using attribute values associated with personal data records as inputs, and an existence or a non-existence of a one of the plurality of different status values as outputs, the server computer storing first data comprising a plurality of attribute values for a plurality of the personal data records and second data indicating, for each personal data record of the plurality of personal data records, whether the personal data record has the status value, the server computer being configured to train the particular machine learning system in the protected environment only if the first data and the second data satisfy a first criterion and being configured to send the particular machine learning system to the requesting computing device only if the particular machine learning system satisfies a second criterion; identifying a particular status value, among the plurality of status values, having a highest likelihood value; selecting a specific set of media items at least partly based on the identified particular status value having the highest likelihood value, in a number indicated by the request to send certain media items outside the protected environment to the client computing device; and sending, from the server computer to the client computing device, the specific set of media items that have been selected.

“2. The method of claim 1, further comprising the server computer using the highest likelihood value associated with the particular status value to dynamically price sending media items to the client computing device by determining a charged price by discounting a standard price by an amount corresponding to a percentage value.

“3. The method of claim 1, further comprising the server computer requesting attribute data from an outside attribute database based on information received from the client computing device.

“4. The method of claim 1, further comprising: receiving, from the requesting computing device that is outside the protected environment, particular attributes for the client computing device; and determining, based on the particular attributes, whether to serve a particular media item to the client computing device.

“5. The method of claim 1, further comprising the server computer storing attribute values for a plurality of different client computing devices in an attribute database in the protected environment.

“6. The method of claim 1, the first criterion being a minimum number of instances in the second data of a particular personal data record having the status value.

“7. The method of claim 1, the second criterion being a maximum fraction of population at risk.

“8. The method of claim 7, further comprising computing the maximum fraction of population at risk as a quotient of a number of instances in the subset of the first data of a patient having the status value and a number of positive predictions of the status value from applying the particular machine learning system to each of the plurality of personal data records in the first data.

“9. The method of claim 1, further comprising: training the particular machine learning system with a first set of parameters; and determining that the particular machine learning system does not satisfy the second criterion and, in response, training the particular machine learning system using a second set of parameters.

“10. The method of claim 1, the status being one of a particular medical diagnosis or a particular prescription.

“11. A computer system comprising: one or more processors; and one or more computer-readable non-transitory storage media coupled to one or more of the processors and storing instructions operable when executed by one or more of the processors to cause the system to perform a method comprising: storing, using a server computer executing within a protected environment, a plurality of media items, each of the media items corresponding to one of a plurality of different status values; receiving, from a requesting computing device that is outside the protected environment, a request to send certain media items outside the protected environment to a client computing device; computing, using a plurality of machine learning systems executed by the server computer, each of the machine learning systems having been trained with one of the plurality of status values as an output, a plurality of likelihood values associated with a particular status value for the client computing device, each of the machine learning systems having been trained at least in part by receiving, by the server computer executing within the protected environment, instructions to generate and train a particular machine learning system, using attribute values associated with personal data records as inputs, and an existence or a non-existence of a one of the plurality of different status values as outputs, the server computer storing first data comprising a plurality of attribute values for a plurality of the personal data records and second data indicating, for each personal data record of the plurality of personal data records, whether the personal data record has the status value, the server computer being configured to train the particular machine learning system in the protected environment only if the first data and the second data satisfy a first criterion and being configured to send the particular machine learning system to the requesting computing device only if the particular machine learning system satisfies a second criterion; identifying a particular status value, among the plurality of status values, having a highest likelihood value; selecting a specific set of media items at least partly based on the identified particular status value having the highest likelihood value, in a number indicated by the request to send certain media items outside the protected environment to the client computing device; and sending, from the server computer to the client computing device, the specific set of media items that have been selected.

“12. The system of claim 11, the storage media further comprising instructions which when executed by one or more of the processors cause the system to perform using the highest likelihood value associated with the particular status value to dynamically price sending media items to the client computing device by determining a charged price by discounting a standard price by an amount corresponding to a percentage value.

“13. The system of claim 11, the storage media further comprising instructions which when executed by one or more of the processors cause the system to perform requesting attribute data from an outside attribute database based on information received from the client computing device.

“14. The system of claim 11, the storage media further comprising instructions which when executed by one or more of the processors cause the system to perform: receiving, from the requesting computing device that is outside the protected environment, particular attributes for the client computing device; and determining, based on the particular attributes, whether to serve a particular media item to the client computing device.

“15. The system of claim 11, the storage media further comprising instructions which when executed by one or more of the processors cause the system to perform storing attribute values for a plurality of different client computing devices in an attribute database in the protected environment.

“16. The system of claim 11, the first criterion being a minimum number of instances in the second data of a particular personal data record having the status value.

“17. The system of claim 11, the second criterion being a maximum fraction of population at risk.

“18. The system of claim 17, the storage media further comprising instructions which when executed by one or more of the processors cause the system to perform computing the maximum fraction of population at risk as a quotient of a number of instances in the subset of the first data of a patient having the status value and a number of positive predictions of the status value from applying the particular machine learning system to each of the plurality of personal data records in the first data.

“19. The system of claim 11, the storage media further comprising instructions which when executed by one or more of the processors cause the system to perform: training the particular machine learning system with a first set of parameters; and determining that the particular machine learning system does not satisfy the second criterion and, in response, training the particular machine learning system using a second set of parameters.

“20. The system of claim 11, the status being one of a particular medical diagnosis or a particular prescription.”
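The two gating criteria in the claims (a minimum count before training, a maximum fraction of population at risk before release, and retraining with different parameters on failure) can be sketched as below. This is an illustration under stated assumptions, not the patented implementation. All function names, the threshold-based toy trainer, and the numeric values are hypothetical. Claims 8 and 18 refer to "the subset" without defining it; the sketch assumes it means the records the model predicts positive, so the quotient is the share of positive predictions that truly have the status value.

```python
from typing import Callable, Iterable, Optional, Sequence

Model = Callable[[Sequence[float]], bool]


def meets_first_criterion(labels: Sequence[bool], min_positives: int) -> bool:
    """Assumed reading of claims 6/16: enough records have the status value."""
    return sum(labels) >= min_positives


def fraction_at_risk(labels: Sequence[bool], predictions: Sequence[bool]) -> float:
    """Assumed reading of claims 8/18: true positives / positive predictions."""
    predicted_positive = sum(predictions)
    if predicted_positive == 0:
        return 0.0
    true_positive = sum(1 for y, p in zip(labels, predictions) if y and p)
    return true_positive / predicted_positive


def train_and_gate(
    features: Sequence[Sequence[float]],
    labels: Sequence[bool],
    train: Callable[[Sequence[Sequence[float]], Sequence[bool], float], Model],
    parameter_sets: Iterable[float],
    min_positives: int,
    max_fraction: float,
) -> Optional[Model]:
    """Train only if the data passes the first criterion; release a model only
    if it passes the second, trying each parameter set in turn (claims 9/19)."""
    if not meets_first_criterion(labels, min_positives):
        return None  # training is refused inside the protected environment
    for params in parameter_sets:
        model = train(features, labels, params)
        predictions = [model(x) for x in features]
        if fraction_at_risk(labels, predictions) <= max_fraction:
            return model  # safe to send outside the protected environment
    return None  # no parameter set produced a releasable model


def toy_train(features, labels, threshold: float) -> Model:
    """Hypothetical stand-in trainer: predicts positive above a fixed threshold."""
    return lambda x: x[0] >= threshold


# Made-up toy data for illustration only.
features = [[0.2], [0.5], [0.7], [0.9]]
labels = [False, False, True, True]

# Threshold 0.6 yields a fraction of 1.0 and is rejected; 0.4 yields 2/3 and passes.
model = train_and_gate(features, labels, toy_train, [0.6, 0.4],
                       min_positives=2, max_fraction=0.8)
```

In this toy run the first parameter set is rejected because every positive prediction corresponds to a real record with the status value, and the second is released because its predictions are less exact.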

For additional information on this patent, see: Dakic, Vaso. Utilizing a protected server environment to protect data used to train a machine learning system. U.S. Patent Number 11449632, filed December 8, 2021, and published online on September 20, 2022. Patent URL: http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=11449632.PN.&OS=PN/11449632RS=PN/11449632
