Data showed the domestic equity market gave up half the gains that it had amassed.
Data can be retrieved even after owners perform a factory reset, researcher says.
Data shows Britons are fearful of relaxing social restrictions as Prime Minister Boris Johnson prepares to publish 'road map' for easing lockdown
Data compiled by online comparison site lays bare how much Britons are missing regular pampering during coronavirus lockdown
A new global simulation model offers the first long-term look at how urbanization -- the growth of cities and towns -- will unfold in the coming decades. The research team projects that the total amount of urban area on Earth could grow anywhere from 1.8- to 5.9-fold by 2100, adding approximately 618,000 square miles of new urban land.
Data centres and digital information processors are reaching their capacity limits and producing excess heat. Foundational work here on optical-acoustic microchips opens the door to a low-heat, low-energy, fast internet.
Data leaks of this magnitude are virtually unheard of in Canada
Looking for journal articles with associated data sets? New search filters in PMC and PubMed aim to increase the discoverability of articles with associated data information.
In PMC, users can now search on or append searches with filters to discover articles with specific types of associated data.
Alternatively, users can run a search on “has associated data”[filter] to find all articles with any type of data section described above.
In PubMed, users can now search on or append searches with data[filter] to find articles with related data links in either the Secondary Source ID field or the LinkOut – Other Literature Resources field (both located below the abstract). These data links may be to records in other NLM databases (e.g., GenBank) or external data repositories (e.g., figshare, Dryad).
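For programmatic access, the same `data[filter]` term can be appended to a query sent to the NCBI E-utilities esearch endpoint. A minimal sketch, assuming only the standard library; the query term is illustrative, and current parameter details should be checked against the E-utilities documentation:

```python
from urllib.parse import urlencode

# NCBI E-utilities esearch endpoint (illustrative usage; see the E-utilities docs)
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_data_filter_url(term: str) -> str:
    """Build an esearch URL that restricts a PubMed query to articles
    with related data links by appending data[filter]."""
    query = f"({term}) AND data[filter]"
    return EUTILS + "?" + urlencode({"db": "pubmed", "term": query, "retmode": "json"})

url = pubmed_data_filter_url("breast cancer genomics")
print(url)
```

Fetching the resulting URL would return the matching PubMed IDs; the sketch stops at building the query so it makes no network calls.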
The provision and availability of associated datasets still varies widely from article to article, but it is our hope that this small step helps improve the discoverability of this material and supports wider community efforts to advance science in new directions.
Data from a large US cohort suggest systemic anticoagulation may confer a survival benefit in hospitalized patients without a spike in bleeding events.
The exponential growth of data and artificial intelligence is creating a tug-of-war between data for profit and data for the common good. In this struggle, it is fundamental that we protect our basic human data rights.
Effective, planned shoreline management would stimulate tourism, support the development of ocean and beach landscapes, and conserve biodiversity along with coastal people’s livelihoods.
Datasec Solutions Pty Ltd (Datasec or the Company) is an IT security company based in Melbourne, Australia.
Jim Harris shows how data, analytics and humans work together to form the "insight equation."
The post Data, analytics and humans: The insight equation appeared first on The Data Roundtable.
Data-driven businesses use technology as an insight platform to empower nontechnical users.
The post Data-driven to business insights appeared first on The Data Roundtable.
Data-driven is a buzzword again. It feels new and shiny but has been around for years. Yet, people still ask what it means to be data-driven. If you wonder why it’s important to be data-driven, you might ask your bookie. Yes, I said your bookie. In thinking about when I [...]
The post Data-driven is better: Lessons from handicapping the ponies appeared first on Government Data Connection.
Data, collaboration, and the IoT are reframing loyalty for a new age
Data Dashboard to be Updated During Noon Hour Starting Monday
Smyrna, DE (April 19, 2020) – Starting Monday, April 20, 2020, the Delaware Division of Public Health (DPH) will begin providing its daily updates on COVID-19 statistics on the de.gov/coronavirus website during the noon hour. Data will reflect the most current information available as of […]
Datask will be its proprietary consumer-based targeted marketing and insights platform.
June 26, 2019 DOVER, DE – The Delaware Department of Insurance recently received notice of a data security breach suffered by Dominion National, an insurer and administrator of dental and vision benefits. On April 24, 2019, through its investigation of an internal alert, Dominion National discovered that servers containing enrollment data, demographic details, and personal […]
The Education Week Research Center looks at student scores on the National Assessment of Educational Progress from 2003 to 2015, a period overlapping with the No Child Left Behind Act.
New data tools allow users to see how public schools fall short when it comes to providing all students the resources they need to meet their highest potential.
New survey data from Education Week show that most K-2 teachers and education professors are using instructional methods that run counter to cognitive science.
Teachers can feel pressured to use education technology products without knowing how to protect their own and their students' privacy, according to a new online survey by privacy advocacy groups.
Every teacher wants his or her students to be successful, and chances are each teacher is already doing a great deal with the information he or she has to make that happen. As team leaders, we want to help our teachers leverage the information they have to create the most targeted and effective instruction.
Many state and local education agency websites aren't disclosing the presence of third-party tracking services, which can use information about users' browsing.
State structures can make the difference in whether local education research partnerships are effective, according to a new report by the Data Quality Campaign.
Denver Newsroom, May 7, 2020 / 05:29 pm (CNA).- A Notre Dame sociologist is using data to challenge a Harvard Law professor’s assertions that homeschooling is “dangerous” and detrimental to society.
The controversy stems from a recent paper by professor Elizabeth Bartholet in which she calls for a presumptive ban on homeschooling in the United States.
Bartholet, as quoted in a Harvard Magazine piece based on her paper, points to unspecified “surveys of homeschoolers” to assert that “up to 90 percent” of homeschooling families are “driven by conservative Christian beliefs, and seek to remove their children from mainstream culture.”
“Some” homeschooling parents are “‘extreme religious ideologues’ who question science and promote female subservience and white supremacy,” she writes.
David Sikkink, associate professor in the Department of Sociology at the University of Notre Dame, analyzed surveys of homeschooling families— including a 2016 government survey— and found that these families are not overwhelmingly Christian nor religious, and are not as universally closed-off to the outside world as Bartholet asserts.
In the analysis Sikkink conducted, just 16% of homeschooling parents said they were homeschooling primarily for religious reasons. The number one reason homeschooling parents cited was a concern about school environment, such as safety, drugs, or negative peer pressure.
Eleven percent of parents reported homeschooling because their child has special needs.
While approximately half of the homeschooling parents surveyed mentioned religion as a factor in their decision to homeschool, Sikkink notes that the parents who cited religion as a reason were, on the whole, more highly educated than those parents who did not.
In terms of Bartholet’s assertion that some homeschooling parents “believe that women should be totally subservient to men and educated in ways that promote such subservience,” Sikkink’s analysis did not find evidence that religious households oppose higher education for girls.
Among the homeschooling families in the survey who use a religious curriculum, there was no difference in their self-reported educational expectations— i.e., what education level they expected their children to reach— for their male children vs. their female children.
Several past studies have shown that homeschool students typically outperform their public and private school counterparts on things like standardized tests and college performance. A 2016 study from the National Council on Measurement in Education showed that, when adjusted for demographic factors, homeschool students were on par academically with their demographically-similar peers.
Moreover, the data Sikkink analyzed suggests that after family background and demographic controls are accounted for, about 64% of homeschoolers “completely agree” that they have much in life to be thankful for, compared to 53% of public schoolers.
On feelings of helplessness, or lack of goals or direction in life, homeschoolers do not substantially differ from their public school counterparts, the analysis suggests.
In the Arizona Law Review, Bartholet argues that while homeschool children may perform as well as their peers on standardized tests or in college, they are also often isolated from their peers and denied experiences and exposures that would make them more productive citizens.
Bartholet claims in her article that “a very large proportion of homeschooling parents are ideologically committed to isolating their children from the majority culture and indoctrinating them in views and values that are in serious conflict with that culture.”
“Isolated families,” she asserts, “constitute a significant part of the homeschooling world.”
In contrast, Sikkink’s analysis found that among the schooling groups surveyed, homeschooling families had the highest level of “community involvement” of all school sectors.
“Community involvement” activities included attending sporting events, attending concerts, going to the zoo or aquarium, going to a museum, going to a library, visiting a bookstore, or attending an event sponsored by a community, religious, or ethnic group.
Homeschooling graduates are almost identical to their public school counterparts in likelihood to vote in federal and local elections, Sikkink found.
Furthermore, the total number of volunteer and community service hours for homeschooling graduates is very similar to or slightly higher than public school graduates, the analysis found.
Bartholet asserts that some homeschoolers “engage in homeschooling to promote racist ideologies and avoid racial intermingling.”
In contrast: “The reality is that about 41% of homeschooled children are racial and ethnic minorities,” Sikkink writes.
“When asked about four closest friends, about 37% of young adult homeschoolers...mention someone of a different race or ethnicity—exactly the same as public schoolers.”
This diversity also extends to schooling practices— increasingly, Sikkink says, homeschooling adopts new forms, including “hybrids” that combine the benefits of home and institutional schooling.
“About 57 percent of homeschoolers are using some form of instruction outside the family,” Sikkink told CNA in an email.
“That includes using tutors, private or public schools, colleges or universities, or homeschooling coops. That percentage would be higher if we included those who reported obtaining curriculum from formal institutions, such as public schools.”
Moreover, about a third of homeschooling parents obtain their curriculum or books from a public school or school district.
“Altogether, 46% of homeschoolers have some pedagogical relationship with public schools,” Sikkink asserts.
Bartholet argues that homeschooling puts children at risk of abuse by their parents, while if children were in public schools, they would be among teachers who are mandatory reporters of any suspected abuse that may be taking place.
“The issue is, do we think that parents should have 24/7, essentially authoritarian control over their children from ages zero to 18? I think that’s dangerous,” Bartholet asserts in the Harvard Magazine piece.
“I think it’s always dangerous to put powerful people in charge of the powerless, and to give the powerful ones total authority.”
Sikkink says Bartholet’s image of a child confined to the home “24/7...from ages zero to 18” is not consistent with the data.
“When we look at the use of homeschooling for each year of the child's upbringing, we only find a small percentage that report that the child was homeschooled for all their years of schooling,” Sikkink told CNA in an email.
Many of these students are part-time public schoolers— about 25% of homeschoolers receive some instruction in public schools during their school-age careers, he wrote.
Homeschooling regulations vary widely by state. Sikkink told CNA he hopes future studies will examine the effects of state-level variation in regulation on homeschooling quality.
“The question of schooling oversight remains, of course, but it would be short-sighted not to keep homeschooling and other creative schooling options in the mix, including the hybrid models that cross sector boundaries,” Sikkink concludes.
Subsequent to the publication of this story, Sikkink told CNA he had revised his assessment of the percentage of homeschoolers using instruction outside the family, from 64% to 57%. The story has been updated to reflect that assessment.
The number of cases of COVID-19 in First Nations reserves continues to rise this week, with 161 confirmed positive cases reported as of May 5.
Keynote speech by Mr Agustín Carstens, General Manager of the BIS, at the 55th SEACEN Governors' Conference and High-level Seminar on "Data and technology: embracing innovation", Singapore, 14 November 2019.
Divyansh Agarwal, Jingshu Wang, Nancy R. Zhang.
Source: Statistical Science, Volume 35, Number 1, 112--128.
Abstract:
Single cell sequencing technologies are transforming biomedical research. However, due to the inherent nature of the data, single cell RNA sequencing analysis poses new computational and statistical challenges. We begin with a survey of a selection of topics in this field, with a gentle introduction to the biology and a more detailed exploration of the technical noise. We consider in detail the problem of single cell data denoising, sometimes referred to as “imputation” in the relevant literature. We discuss why this is not a typical statistical imputation problem, and review current approaches to this problem. We then explore why the use of denoised values in downstream analyses invites novel statistical insights, and how denoising uncertainty should be accounted for to yield valid statistical inference. The utilization of denoised or imputed matrices in statistical inference is not unique to single cell genomics, and arises in many other fields. We describe the challenges in this type of analysis, discuss some preliminary solutions, and highlight unresolved issues.
A large store of data for analysis. Organizations use data warehouses (and smaller 'data marts') to help them analyze historic transaction data and detect useful patterns and trends. First, the data is transferred into the data warehouse using a process called extract, transform and load (ETL). It is then organized and stored in the data warehouse in ways that optimize it for high-performance analysis. The transfer to a separate data warehouse system, usually performed as a regular batch job every night or at some other interval, insulates the live transaction systems from any side effects of the analysis, at the cost of the very latest data not being included in the analysis.
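The ETL-then-analyze pattern can be sketched with two SQLite databases standing in for the live transaction system and the warehouse. Table names and the schema are illustrative, not part of any real system:

```python
import sqlite3

live = sqlite3.connect(":memory:")       # stands in for the live transaction system
warehouse = sqlite3.connect(":memory:")  # stands in for the separate warehouse

live.execute("CREATE TABLE sales (id INTEGER, amount REAL, day TEXT)")
live.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(1, 10.0, "2024-01-01"), (2, 15.0, "2024-01-01"),
                  (3, 7.5, "2024-01-02")])

# Extract: read raw rows from the live system.
rows = live.execute("SELECT amount, day FROM sales").fetchall()

# Transform: aggregate to daily totals, a shape optimized for analysis.
totals = {}
for amount, day in rows:
    totals[day] = totals.get(day, 0.0) + amount

# Load: write the transformed data into the warehouse.
warehouse.execute("CREATE TABLE daily_sales (day TEXT PRIMARY KEY, total REAL)")
warehouse.executemany("INSERT INTO daily_sales VALUES (?, ?)", sorted(totals.items()))

print(warehouse.execute("SELECT * FROM daily_sales ORDER BY day").fetchall())
# → [('2024-01-01', 25.0), ('2024-01-02', 7.5)]
```

In a real deployment this whole script would run as the nightly batch job, so analytical queries never touch the live system.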
Data about data. In common usage as a generic term, metadata stores data about the structure, context and meaning of raw data; computers use it to help organize and interpret data, turning it into meaningful information. The World Wide Web has driven usage of metadata to new levels, as the tags used in HTML and XML are a form of metadata, although the meaning they convey is often limited because the same metadata can mean different things to different people.
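A minimal illustration of the idea: a hypothetical metadata record describes the name, type, and meaning of each column, and a program uses it to turn raw strings into typed, meaningful values. All names here are invented for the example:

```python
# Hypothetical metadata describing the structure and meaning of raw data rows.
metadata = {
    "columns": [
        {"name": "temp_c", "type": "float", "meaning": "air temperature, Celsius"},
        {"name": "station", "type": "str", "meaning": "weather station code"},
    ]
}

CASTS = {"float": float, "str": str}

def interpret(raw_row, meta):
    """Use metadata to turn a row of raw strings into typed information."""
    return {col["name"]: CASTS[col["type"]](value)
            for col, value in zip(meta["columns"], raw_row)}

record = interpret(["21.5", "OSLO1"], metadata)
print(record)  # → {'temp_c': 21.5, 'station': 'OSLO1'}
```

Without the metadata, the row `["21.5", "OSLO1"]` is just two strings; with it, the program knows one is a temperature and the other a station code.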
Data-space inversion (DSI) and related procedures represent a family of methods applicable for data assimilation in subsurface flow settings. These methods differ from model-based techniques in that they provide only posterior predictions for quantities (time series) of interest, not posterior models with calibrated parameters. DSI methods require a large number of flow simulations to first be performed on prior geological realizations. Given observed data, posterior predictions can then be generated directly. DSI operates in a Bayesian setting and provides posterior samples of the data vector. In this work we develop and evaluate a new approach for data parameterization in DSI. Parameterization reduces the number of variables to determine in the inversion, and it maintains the physical character of the data variables. The new parameterization uses a recurrent autoencoder (RAE) for dimension reduction, and a long short-term memory (LSTM) network to represent flow-rate time series. The RAE-based parameterization is combined with an ensemble smoother with multiple data assimilation (ESMDA) for posterior generation. Results are presented for two- and three-phase flow in a 2D channelized system and a 3D multi-Gaussian model. The RAE procedure, along with existing DSI treatments, is assessed through comparison to reference rejection sampling (RS) results. The new DSI methodology is shown to consistently outperform existing approaches, in terms of statistical agreement with RS results. The method is also shown to accurately capture derived quantities, which are computed from variables considered directly in DSI. This requires correlation and covariance between variables to be properly captured, and accuracy in these relationships is demonstrated. The RAE-based parameterization developed here is clearly useful in DSI, and it may also find application in other subsurface flow problems.
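The ESMDA step used for posterior generation can be illustrated on a toy problem. The sketch below is not the paper's RAE-parameterized implementation: it uses plain NumPy, a scalar linear forward model, and illustrative settings, and performs the standard multiple-data-assimilation update in which the inflation factors' inverses sum to one:

```python
import numpy as np

rng = np.random.default_rng(0)

def esmda(m, forward, d_obs, sigma, n_assim=4):
    """Toy ES-MDA: n_assim ensemble updates, each with inflation
    alpha = n_assim (so that sum over steps of 1/alpha equals 1)."""
    alpha = float(n_assim)
    for _ in range(n_assim):
        d = forward(m)                     # predicted data, one value per member
        c_md = np.cov(m, d)[0, 1]          # cross-covariance between m and d
        c_dd = np.var(d, ddof=1)           # data covariance
        gain = c_md / (c_dd + alpha * sigma**2)
        # Perturb observations with inflated noise, then update each member.
        d_pert = d_obs + np.sqrt(alpha) * sigma * rng.standard_normal(m.size)
        m = m + gain * (d_pert - d)
    return m

forward = lambda m: 2.0 * m            # linear forward model g(m) = 2m
m_prior = rng.standard_normal(200)     # prior ensemble, N(0, 1)
m_post = esmda(m_prior, forward, d_obs=4.0, sigma=0.1)
print(m_post.mean())                   # close to the true value 2.0
```

For this linear-Gaussian toy case the ES-MDA posterior matches the analytic Bayesian posterior; the paper's contribution is in parameterizing realistic flow-rate time series so that this kind of update remains well-posed.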
Gregory J. Matthews, Ofer Harel
Source: Statist. Surv., Volume 5, 1--29.
Abstract:
There is an ever increasing demand from researchers for access to useful microdata files. However, there are also growing concerns regarding the privacy of the individuals contained in the microdata. Ideally, microdata could be released in such a way that a balance between usefulness of the data and privacy is struck. This paper presents a review of proposed methods of statistical disclosure control and techniques for assessing the privacy of such methods under different definitions of disclosure.
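As a concrete illustration of one simple family of disclosure-control methods reviewed in such work, the sketch below applies additive-noise masking to synthetic numeric microdata. The data, noise scale, and variable names are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def noise_mask(values, scale=0.1):
    """Additive-noise masking: perturb each record with Gaussian noise
    whose standard deviation is `scale` times that of the data."""
    sd = values.std(ddof=1)
    return values + rng.normal(0.0, scale * sd, size=values.shape)

incomes = rng.normal(50_000, 10_000, size=1_000)   # synthetic microdata
masked = noise_mask(incomes)

# Utility is roughly preserved at the aggregate level...
print(round(incomes.mean()), round(masked.mean()))
# ...while no individual record is released exactly.
print(np.any(incomes == masked))
```

This illustrates the usefulness/privacy balance the abstract describes: larger noise gives stronger protection but degrades aggregate statistics, which is exactly the trade-off the reviewed assessment methods try to quantify.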
References:
Abowd, J., Woodcock, S., 2001. Disclosure limitation in longitudinal linked data. Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, 215–277.
Adam, N.R., Worthmann, J.C., 1989. Security-control methods for statistical databases: a comparative study. ACM Comput. Surv. 21 (4), 515–556.
Armstrong, M., Rushton, G., Zimmerman, D.L., 1999. Geographically masking health data to preserve confidentiality. Statistics in Medicine 18 (5), 497–525.
Bethlehem, J.G., Keller, W., Pannekoek, J., 1990. Disclosure control of microdata. Journal of the American Statistical Association 85, 38–45.
Blum, A., Dwork, C., McSherry, F., Nissim, K., 2005. Practical privacy: The SuLQ framework. In: Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. pp. 128–138.
Bowden, R.J., Sim, A.B., 1992. The privacy bootstrap. Journal of Business and Economic Statistics 10 (3), 337–345.
Carlson, M., Salabasis, M., 2002. A data-swapping technique for generating synthetic samples; a method for disclosure control. Res. Official Statist. (5), 35–64.
Cox, L.H., 1980. Suppression methodology and statistical disclosure control. Journal of the American Statistical Association 75, 377–385.
Cox, L.H., 1984. Disclosure control methods for frequency count data. Tech. rep., U.S. Bureau of the Census.
Cox, L.H., 1987. A constructive procedure for unbiased controlled rounding. Journal of the American Statistical Association 82, 520–524.
Cox, L.H., 1994. Matrix masking methods for disclosure limitation in microdata. Survey Methodology 6, 165–169.
Cox, L.H., Fagan, J.T., Greenberg, B., Hemmig, R., 1987. Disclosure avoidance techniques for tabular data. Tech. rep., U.S. Bureau of the Census.
Dalenius, T., 1977. Towards a methodology for statistical disclosure control. Statistik Tidskrift 15, 429–444.
Dalenius, T., 1986. Finding a needle in a haystack - or identifying anonymous census record. Journal of Official Statistics 2 (3), 329–336.
Dalenius, T., Denning, D., 1982. A hybrid scheme for release of statistics. Statistisk Tidskrift.
Dalenius, T., Reiss, S.P., 1982. Data-swapping: A technique for disclosure control. Journal of Statistical Planning and Inference 6, 73–85.
De Waal, A., Hundepool, A., Willenborg, L., 1995. Argus: Software for statistical disclosure control of microdata. U.S. Census Bureau.
DeGroot, M.H., 1962. Uncertainty, information, and sequential experiments. Annals of Mathematical Statistics 33, 404–419.
DeGroot, M.H., 1970. Optimal Statistical Decisions. Mansell, London.
Dinur, I., Nissim, K., 2003. Revealing information while preserving privacy. In: Proceedings of the 22nd ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. pp. 202–210.
Domingo-Ferrer, J., Torra, V., 2001a. A Quantitative Comparison of Disclosure Control Methods for Microdata. In: Doyle, P., Lane, J., Theeuwes, J., Zayatz, L. (Eds.), Confidentiality, Disclosure and Data Access - Theory and Practical Applications for Statistical Agencies. North-Holland, Amsterdam, Ch. 6, pp. 113–135.
Domingo-Ferrer, J., Torra, V., 2001b. Disclosure control methods and information loss for microdata. In: Doyle, P., Lane, J., Theeuwes, J., Zayatz, L. (Eds.), Confidentiality, Disclosure and Data Access - Theory and Practical Applications for Statistical Agencies. North-Holland, Amsterdam, Ch. 5, pp. 93–112.
Duncan, G., Lambert, D., 1986. Disclosure-limited data dissemination. Journal of the American Statistical Association 81, 10–28.
Duncan, G., Lambert, D., 1989. The risk of disclosure for microdata. Journal of Business & Economic Statistics 7, 207–217.
Duncan, G., Pearson, R., 1991. Enhancing access to microdata while protecting confidentiality: prospects for the future (with discussion). Statistical Science 6, 219–232.
Dwork, C., 2006. Differential privacy. In: ICALP. Springer, pp. 1–12.
Dwork, C., 2008. An ad omnia approach to defining and achieving private data analysis. In: Lecture Notes in Computer Science. Springer, p. 10.
Dwork, C., Lei, J., 2009. Differential privacy and robust statistics. In: Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC). pp. 371–380.
Dwork, C., McSherry, F., Nissim, K., Smith, A., 2006. Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference. Springer, pp. 265–284.
Dwork, C., Nissim, K., 2004. Privacy-preserving datamining on vertically partitioned databases. In: Advances in Cryptology: Proceedings of Crypto. pp. 528–544.
Elliot, M., 2000. DIS: a new approach to the measurement of statistical disclosure risk. International Journal of Risk Assessment and Management 2, 39–48.
Federal Committee on Statistical Methodology (FCSM), 2005. Statistical policy working group 22 - report on statistical disclosure limitation methodology. U.S. Census Bureau.
Fellegi, I.P., 1972. On the question of statistical confidentiality. Journal of the American Statistical Association 67 (337), 7–18.
Fienberg, S.E., McIntyre, J., 2004. Data swapping: Variations on a theme by Dalenius and Reiss. In: Domingo-Ferrer, J., Torra, V. (Eds.), Privacy in Statistical Databases. Vol. 3050 of Lecture Notes in Computer Science. Springer Berlin/Heidelberg, pp. 519, http://dx.doi.org/10.1007/978-3-540-25955-8_2
Fuller, W., 1993. Masking procedures for microdata disclosure limitation. Journal of Official Statistics 9, 383–406.
General Assembly of the United Nations, 1948. Universal declaration of human rights.
Gouweleeuw, J., Kooiman, P., Willenborg, L., de Wolf, P.-P., 1998. Post randomisation for statistical disclosure control: Theory and implementation. Journal of Official Statistics 14 (4), 463–478.
Greenberg, B., 1987. Rank swapping for masking ordinal microdata. Tech. rep., U.S. Bureau of the Census (unpublished manuscript), Suitland, Maryland, USA.
Greenberg, B.G., Abul-Ela, A.-L.A., Simmons, W.R., Horvitz, D.G., 1969. The unrelated question randomized response model: Theoretical framework. Journal of the American Statistical Association 64 (326), 520–539.
Harel, O., Zhou, X.-H., 2007. Multiple imputation: Review and theory, implementation and software. Statistics in Medicine 26, 3057–3077.
Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Lenz, R., Longhurst, J., Nordholt, E.S., Seri, G., de Wolf, P.-P., 2006. A CENtre of EXcellence for Statistical Disclosure Control Handbook on Statistical Disclosure Control Version 1.01.
Hundepool, A., Wetering, A. v.d., Ramaswamy, R., Wolf, P.d., Giessing, S., Fischetti, M., Salazar, J., Castro, J., Lowthian, P., Feb. 2005. τ-argus 3.1 user manual. Statistics Netherlands, Voorburg NL.
Hundepool, A., Willenborg, L., 1996. μ- and τ-argus: Software for statistical disclosure control. Third International Seminar on Statistical Confidentiality, Bled.
Karr, A., Kohnen, C.N., Oganian, A., Reiter, J.P., Sanil, A.P., 2006. A framework for evaluating the utility of data altered to protect confidentiality. American Statistician 60 (3), 224–232.
Kaufman, S., Seastrom, M., Roey, S., 2005. Do disclosure controls to protect confidentiality degrade the quality of the data? In: American Statistical Association, Proceedings of the Section on Survey Research.
Kennickell, A.B., 1997. Multiple imputation and disclosure protection: the case of the 1995 survey of consumer finances. Record Linkage Techniques, 248–267.
Kim, J., 1986. Limiting disclosure in microdata based on random noise and transformation. Bureau of the Census.
Krumm, J., 2007. Inference attacks on location tracks. Proceedings of Fifth International Conference on Pervasive Computing, 127–143.
Li, N., Li, T., Venkatasubramanian, S., 2007. t-closeness: Privacy beyond k-anonymity and l-diversity. In: Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on. pp. 106–115.
Liew, C.K., Choi, U.J., Liew, C.J., 1985. A data distortion by probability distribution. ACM Trans. Database Syst. 10 (3), 395–411.
Little, R.J.A., 1993. Statistical analysis of masked data. Journal of Official Statistics 9, 407–426.
Little, R.J.A., Rubin, D.B., 1987. Statistical Analysis with Missing Data. John Wiley & Sons.
Liu, F., Little, R.J.A., 2002. Selective multiple imputation of keys for statistical disclosure control in microdata. In: Proceedings Joint Statistical Meet. pp. 2133–2138.
Machanavajjhala, A., Kifer, D., Abowd, J., Gehrke, J., Vilhuber, L., April 2008. Privacy: Theory meets practice on the map. In: International Conference on Data Engineering. Cornell University Computer Science Department, Cornell, USA, p. 10.
Machanavajjhala, A., Kifer, D., Gehrke, J., Venkitasubramaniam, M., 2007. L-diversity: Privacy beyond k-anonymity. ACM Trans. Knowl. Discov. Data 1 (1), 3.
Manning, A.M., Haglin, D.J., Keane, J.A., 2008. A recursive search algorithm for statistical disclosure assessment. Data Min. Knowl. Discov. 16 (2), 165–196.
Marsh, C., Skinner, C., Arber, S., Penhale, B., Openshaw, S., Hobcraft, J., Lievesley, D., Walford, N., 1991. The case for samples of anonymized records from the 1991 census. Journal of the Royal Statistical Society 154 (2), 305–340.
Matthews, G.J., Harel, O., Aseltine, R.H., 2010a. Assessing database privacy using the area under the receiver-operator characteristic curve. Health Services and Outcomes Research Methodology 10 (1), 1–15.
Matthews, G.J., Harel, O., Aseltine, R.H., 2010b. Examining the robustness of fully synthetic data techniques for data with binary variables. Journal of Statistical Computation and Simulation 80 (6), 609–624.
Moore, Jr., R., 1996. Controlled data-swapping techniques for masking public use microdata. Census Tech Report.
Mugge, R., 1983. Issues in protecting confidentiality in national health statistics. Proceedings of the Section on Survey Research Methods.
Nissim, K., Raskhodnikova, S., Smith, A., 2007. Smooth sensitivity and sampling in private data analysis. In: STOC ’07: Proceedings of the thirty-ninth annual ACM symposium on Theory of computing. pp. 75–84.
Paass, G., 1988. Disclosure risk and disclosure avoidance for microdata. Journal of Business and Economic Statistics 6 (4), 487–500.
Palley, M., Simonoff, J., 1987. The use of regression methodology for the compromise of confidential information in statistical databases. ACM Trans. Database Systems 12 (4), 593–608.
Raghunathan, T.E., Reiter, J.P., Rubin, D.B., 2003. Multiple imputation for statistical disclosure limitation. Journal of Official Statistics 19 (1), 1–16.
Rajasekaran, S., Harel, O., Zuba, M., Matthews, G.J., Aseltine, Jr., R., 2009. Responsible data releases. In: Proceedings 9th Industrial Conference on Data Mining (ICDM). Springer LNCS, pp. 388–400.
Reiss, S.P., 1984. Practical data-swapping: The first steps. ACM Transactions on Database Systems 9, 20–37.
Reiter, J.P., 2002. Satisfying disclosure restriction with synthetic data sets. Journal of Official Statistics 18 (4), 531–543.
Reiter, J.P., 2003. Inference for partially synthetic, public use microdata sets. Survey Methodology 29 (2), 181–188.
Reiter, J.P., 2004a. New approaches to data dissemination: A glimpse into the future (?). Chance 17 (3), 11–15.
Reiter, J.P., 2004b. Simultaneous use of multiple imputation for missing data and disclosure limitation. Survey Methodology 30 (2), 235–242.
Reiter, J.P., 2005a. Estimating risks of identification disclosure in microdata. Journal of the American Statistical Association 100, 1103–1112.
Reiter, J.P., 2005b. Releasing multiply imputed, synthetic public use microdata: An illustration and empirical study. Journal of the Royal Statistical Society, Series A: Statistics in Society 168 (1), 185–205.
Reiter, J.P., 2005c. Using CART to generate partially synthetic public use microdata. Journal of Official Statistics 21 (3), 441–462.
Rubin, D.B., 1987. Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons.
Rubin, D.B., 1993. Comment on “Statistical disclosure limitation”. Journal of Official Statistics 9, 461–468.
Rubner, Y., Tomasi, C., Guibas, L.J., 1998. A metric for distributions with applications to image databases. Computer Vision, IEEE International Conference on 0, 59.
Sarathy, R., Muralidhar, K., 2002a. The security of confidential numerical data in databases. Information Systems Research 13 (4), 389–403.
Schafer, J.L., Graham, J.W., 2002. Missing data: Our view of state of the art. Psychological Methods 7 (2), 147–177.
Singh, A., Yu, F., Dunteman, G., 2003. MASSC: A new data mask for limiting statistical information loss and disclosure. In: Proceedings of the Joint UNECE/EUROSTAT Work Session on Statistical Data Confidentiality. pp. 373–394.
Skinner, C., 2009. Statistical disclosure control for survey data. In: Pfeffermann, D and Rao, C.R. eds. Handbook of Statistics Vol. 29A: Sample Surveys: Design, Methods and Applications. pp. 381–396.
Skinner, C., Marsh, C., Openshaw, S., Wymer, C., 1994. Disclosure control for census microdata. Journal of Official Statistics 10, 31–51.
Skinner, C., Shlomo, N., 2008. Assessing identification risk in survey microdata using log-linear models. Journal of the American Statistical Association 103, 989–1001.
Skinner, C.J., Elliot, M.J., 2002. A measure of disclosure risk for microdata. Journal of the Royal Statistical Society. Series B (Statistical Methodology) 64 (4), 855–867.
Smith, A., 2008. Efficient, dfferentially private point estimators. arXiv:0809.4794v1 [cs.CR].
Spruill, N.L., 1982. Measures of confidentiality. Statistics of Income and Related Administrative Record Research, 131–136.
Spruill, N.L., 1983. The confidentiality and analytic usefulness of masked business microdata. In: Proceedings of the Section on Survey Reserach Microdata. American Statistical Association, pp. 602–607.
Sweeney, L., 1996. Replacing personally-identifying information in medical records, the scrub system. In: American Medical Informatics Association. Hanley and Belfus, Inc., pp. 333–337.
Sweeney, L., 1997. Guaranteeing anonymity when sharing medical data, the datafly system. Journal of the American Medical Informatics Association 4, 51–55.
Sweeney, L., 2002a. Achieving k-anonymity privacy protection using generalization and suppression. International Journal of Uncertainty, Fuzziness and Knowledge Based Systems 10 (5), 571–588.
Sweeney, L., 2002b. k-anonymity: A model for protecting privacy. International Journal of Uncertainty, Fuzziness and Knowledge Based Systems 10 (5), 557–570.
Tendick, P., 1991. Optimal noise addition for preserving confidentiality in multivariate data. Journal of Statistical Planning and Inference 27 (2), 341–353.
United Nations Economic Comission for Europe (UNECE), 2007. Manging statistical cinfidentiality and microdata access: Principles and guidlinesof good practice.
Warner, S.L., 1965. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association 60 (309), 63–69.
Wasserman, L., Zhou, S., 2010. A statistical framework for differential privacy. Journal of the American Statistical Association 105 (489), 375–389.
Willenborg, L., de Waal, T., 2001. Elements of Statistical Disclosure Control. Springer-Verlag.
Woodward, B., 1995. The computer-based patient record and confidentiality. The New England Journal of Medicine, 1419–1422.
A data-driven method for respiratory gating in PET has recently been commercially developed. We sought to compare the performance of the algorithm with that of an external, device-based system for oncological [18F]-FDG PET/CT imaging. Methods: 144 whole-body [18F]-FDG PET/CT examinations were acquired using a Discovery D690 or D710 PET/CT scanner (GE Healthcare), with a respiratory gating waveform recorded by an external, device-based respiratory gating system. In each examination, two of the bed positions, covering the liver and lung bases, were acquired with a duration of 6 minutes. Quiescent-period gating retaining ~50% of coincidences was then able to produce images with an effective duration of 3 minutes for these two bed positions, matching the other bed positions. For each exam, four reconstructions were performed and compared: data-driven gating (DDG-retro), external device-based gating (RPM Gated), no gating using only the first 3 minutes of data (Ungated Matched), and no gating retaining all coincidences (Ungated Full). Lesions in the images were quantified, and image quality was scored by a radiologist blinded to the method of data processing. Results: DDG-retro increased SUVmax and decreased the threshold-defined lesion volume in comparison to each of the other reconstruction options. Compared with RPM Gated, DDG-retro gave an average increase in SUVmax of 0.66 ± 0.1 g/mL (n=87, p<0.0005). Although results from the blinded image evaluation were most commonly equivalent, DDG-retro was preferred over RPM Gated in 13% of exams, while the opposite occurred in just 2% of exams. This was a significant preference for DDG-retro (p=0.008, n=121). Liver lesions were identified in 23 exams. In this subset, DDG-retro was ranked superior to Ungated Full in 6/23 (26%) of cases. Gated reconstruction using the external device failed in 16% of exams, whereas DDG-retro always provided a clinically acceptable image.
Conclusion: In this clinical evaluation, the data-driven respiratory gating technique provided superior performance compared with the external device-based system. For the majority of exams the performance was equivalent, but data-driven respiratory gating was superior in 13% of exams, leading to a significant overall preference.
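The quiescent-period gating described above can be sketched in simplified form. This is a minimal illustration under stated assumptions, not the commercial algorithm: the function `quiescent_period_gate`, the simulated cosine respiratory waveform, and the use of an amplitude quantile to hit the ~50% retention target are all assumptions for the example.

```python
import numpy as np

def quiescent_period_gate(amplitudes, retain_fraction=0.5):
    # End-expiration is the low-amplitude part of the respiratory
    # waveform; choose the amplitude threshold as the quantile that
    # retains the requested fraction of coincidences.
    threshold = np.quantile(amplitudes, retain_fraction)
    return amplitudes <= threshold

# Simulated respiratory amplitude sampled at each coincidence event time
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 360.0, size=100_000)           # events over a 6-min bed position
amp = 0.5 * (1.0 - np.cos(2.0 * np.pi * t / 4.0))   # 4-s breathing cycle, range 0..1
mask = quiescent_period_gate(amp, retain_fraction=0.5)
print(f"retained fraction: {mask.mean():.2f}")      # ~0.50 of coincidences kept
```

Keeping half the coincidences from a 6-minute acquisition yields the 3-minute effective duration that matches the ungated bed positions.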
Head motion degrades image quality and causes erroneous parameter estimates in tracer kinetic modeling in brain PET studies. Existing motion correction methods include frame-based image registration (FIR) and correction using real-time hardware-based motion tracking (HMT) information. However, FIR cannot correct for motion within a predefined scan period, while HMT is not readily available in the clinic since it typically requires attaching a tracking device to the patient. In this study, we propose a motion correction framework with a data-driven algorithm, i.e., using the PET raw data itself, to address these limitations. Methods: We propose a data-driven algorithm, Centroid of Distribution (COD), to detect head motion. In COD, the central coordinates of the lines of response (LORs) of all events are averaged over 1-s intervals to generate a COD trace. A point-to-point change in the COD trace in one direction that exceeded a user-defined threshold was defined as a time point of head motion; additional motion time points were then added manually. All the frames defined by these time points were reconstructed without attenuation correction and rigidly registered to a reference frame. The resulting transformation matrices were then used to perform the final motion-compensated reconstruction. We applied the new COD framework to 23 human dynamic datasets, all containing large head motions, with 18F-FDG (N = 13) and 11C-UCB-J (N = 10), and compared its performance with FIR and with HMT using the Vicra, which can be considered the gold standard. Results: The COD method yielded a 1.0±3.2% (mean ± standard deviation across all subjects and 12 grey matter regions) SUV difference for 18F-FDG (3.7±5.4% for 11C-UCB-J) compared with HMT, while no motion correction (NMC) and FIR yielded -15.7±12.2% (-20.5±15.8%) and -4.7±6.9% (-6.2±11.0%), respectively.
For 18F-FDG dynamic studies, COD yielded differences of 3.6±10.9% in Ki compared with HMT, while NMC and FIR yielded -18.0±39.2% and -2.6±19.8%, respectively. For 11C-UCB-J, COD yielded differences of 3.7±5.2% in VT compared with HMT, while NMC and FIR yielded -20.0±12.5% and -5.3±9.4%, respectively. Conclusion: The proposed COD-based data-driven motion correction method outperformed FIR and achieved comparable or better performance than the Vicra HMT method in both static and dynamic studies.
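The two core steps of COD — averaging LOR central coordinates over 1-s intervals and thresholding point-to-point changes in the trace — can be sketched as follows. This is a toy illustration, not the authors' implementation: the function names, the simulated Gaussian LOR midpoints, the 5-unit axial step, and the threshold of 2.0 are assumptions for the example.

```python
import numpy as np

def cod_trace(lor_centers, event_times, interval=1.0):
    # Average the central coordinates of the events' lines of response
    # over fixed 1-s intervals to form the centroid-of-distribution trace.
    bins = (event_times // interval).astype(int)
    n_bins = bins.max() + 1
    trace = np.full((n_bins, 3), np.nan)
    for b in range(n_bins):
        sel = bins == b
        if sel.any():
            trace[b] = lor_centers[sel].mean(axis=0)
    return trace

def detect_motion(trace, threshold):
    # Flag a motion time point wherever the point-to-point change of the
    # trace in any one direction exceeds the user-defined threshold.
    diffs = np.abs(np.diff(trace, axis=0))
    return np.where((diffs > threshold).any(axis=1))[0] + 1

# Simulated events: a 5-unit axial shift of the centroid halfway through
rng = np.random.default_rng(1)
times = rng.uniform(0.0, 60.0, size=50_000)
centers = rng.normal(0.0, 0.2, size=(times.size, 3))
centers[times > 30.0, 2] += 5.0
moves = detect_motion(cod_trace(centers, times), threshold=2.0)
print(moves)  # the 1-s bin in which the shift occurred
```

Each flagged time point splits the data into frames, which in the described framework are then reconstructed without attenuation correction and rigidly registered to a reference frame.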
(American Association for the Advancement of Science) Ice sheet losses from Greenland and Antarctica have outpaced snow accumulation and contributed approximately 14 millimeters to sea level rise over 16 years (2003 to 2019), a new analysis of data from NASA's laser-shooting satellites has revealed.
Data Governance sounds like a candidate for the most boring topic in technology: something dreamed up by middle-managers to add friction to data scientists’ lives. The funny thing about governance, though, is that it’s closely related to data discovery. And data discovery is neither dull nor additional friction; it’s an exciting process that enables great […]
Datatec tells shareholders that the rapidly spreading coronavirus outbreak has reached every region where the group operates.
If you use data to make critical business decisions, this book is for you. Whether you’re a data analyst, research scientist, data engineer, ML engineer, data scientist, application developer, or systems developer, this guide helps you broaden your understanding of the modern data science stack, create your own machine learning pipelines, and deploy them to applications at production scale.