Working paper

Target Variable Engineering

Jessica Clark

Abstract

How does the formulation of a target variable affect performance within the ML pipeline? The experiments in this study examine numeric targets that have been binarized by comparing against a threshold. We compare the predictive performance of regression models trained to predict the numeric targets vs. classifiers trained to predict their binarized counterparts. Specifically, we make this comparison at every point of a randomized hyperparameter optimization search to understand the effect of computational resource budget on the tradeoff between the two. We find that regression requires significantly more computational effort to converge upon the optimal performance, and is more sensitive to both randomness and heuristic choices in the training process. Although classification can and does benefit from systematic hyperparameter tuning and model selection, the improvements are much smaller than those for regression. This work constitutes the first systematic comparison of regression and classification within the framework of computational resource requirements. Our findings contribute to calls for greater replicability and efficiency within the ML pipeline for the sake of building more sustainable and robust AI systems.
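The comparison described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic data, the k-nearest-neighbor model, the single hyperparameter k, the use of accuracy as the shared metric, and all function names. It binarizes a numeric target against a threshold, then records the best-so-far validation accuracy of a regression-then-threshold approach and a direct classifier at every step of a random search.

```python
import random


def binarize(values, threshold):
    """Turn numeric targets into 0/1 labels by comparing against a threshold."""
    return [1 if v >= threshold else 0 for v in values]


def knn_mean(train_x, train_t, query, k):
    """Average the targets of the k training points nearest to the query."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_t[i] for i in nearest) / k


def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def best_so_far_curves(n_trials=20, threshold=0.5, seed=0):
    """Run a random search over k and track best-so-far accuracy per trial."""
    rng = random.Random(seed)
    # Synthetic data (an assumption): a noisy linear numeric target.
    xs = [rng.uniform(0, 1) for _ in range(200)]
    ys = [x + rng.gauss(0, 0.2) for x in xs]
    train_x, val_x = xs[:150], xs[150:]
    train_y, val_y = ys[:150], ys[150:]
    train_b = binarize(train_y, threshold)
    val_b = binarize(val_y, threshold)

    reg_curve, clf_curve = [], []
    best_reg = best_clf = 0.0
    for _ in range(n_trials):
        k = rng.randint(1, 50)  # one randomly sampled hyperparameter setting
        # Regression: predict the numeric target, then apply the threshold.
        reg_scores = [knn_mean(train_x, train_y, q, k) for q in val_x]
        reg_pred = binarize(reg_scores, threshold)
        # Classification: majority vote over binarized neighbor labels
        # (a mean of 0/1 labels >= 0.5; ties go to the positive class).
        clf_scores = [knn_mean(train_x, train_b, q, k) for q in val_x]
        clf_pred = binarize(clf_scores, 0.5)
        best_reg = max(best_reg, accuracy(reg_pred, val_b))
        best_clf = max(best_clf, accuracy(clf_pred, val_b))
        reg_curve.append(best_reg)
        clf_curve.append(best_clf)
    return reg_curve, clf_curve


if __name__ == "__main__":
    reg, clf = best_so_far_curves()
    for step, (r, c) in enumerate(zip(reg, clf), start=1):
        print(f"trial {step:2d}  regression={r:.3f}  classification={c:.3f}")
```

Comparing the two curves trial by trial shows how many search steps each formulation needs before its best-so-far score stops improving, which is one simple way to express the budget tradeoff the abstract refers to. A toy setup like this one is not expected to reproduce the paper's findings.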


Under review

Fighting Misinformation on Social Media: An Empirical Investigation of the Impact of Prominence Reduction Strategies

Maya Mudambi, Jessica Clark, Lauren Rhue, Siva Viswanathan

Abstract

Misinformation has dire implications for both public welfare and the operational aims of user-generated content (UGC) platforms. As a result, these platforms have adopted various content moderation strategies aimed at decreasing the volume and impact of misinformation. We study Reddit’s quarantine policy, a prominence reduction strategy that reduces the visibility of misinformation on the platform. We empirically assess the effectiveness, as well as the spillover effects, of quarantine. We find that quarantine diminishes misinformation contribution in directly impacted forums, as well as the dispersion of misinformation across the entire platform. However, prominence reduction has the unintended consequence of pushing misinformation to topically related but ideologically neutral spaces. We find that this spillover of misinformation to neutral spaces declines over time, and is not contagious to native users who were already present. Finally, prominence reduction strategies impact high-contribution users (‘superusers’) much more strongly than other users. The prevalence of misinformation on UGC platforms due to ineffectual content moderation strategies poses reputational, regulatory, and financial risks to platform parent firms. The findings from our study have strong implications for platform operations, and we are able to make specific recommendations to practitioners regarding the effective usage of prominence reduction against misinformation. This paper will be relevant for Information Systems and Operations Management researchers who study UGC platform design, content moderation strategies, online user contribution, and online misinformation.


Under review

Automated Promotion? A Study of the Fairness-Economic Tradeoffs in Reducing Crowdfunding Disparities via AI/ML

Lauren Rhue, Jessica Clark

Abstract

Digital platforms have a widely documented issue with racial disparities, which can result in adverse reputational and economic consequences. Equitable promotion of projects across racial groups can mitigate these disparities. Our research explores how to more equitably determine which projects should be promoted by the platform. Platforms typically rely on their employees to decide what content to highlight, but human decisions are subject to cognitive and implicit biases. We examine whether an algorithm-based approach to choosing which projects to promote can generate more equitable outcomes for people in traditionally marginalized groups while resulting in equivalent economic outcomes. We perform an observational and simulated study on more than 100,000 projects gathered from the crowdfunding platform Kickstarter.com to determine whether machine learning models would diversify the set of promoted projects.

Our analysis yields three main findings. First, machine learning models, both fairness-unaware and fairness-aware, identify a more diverse set of projects to promote than those selected by employees. Second, promoting a more diverse set of projects diminishes but does not completely eliminate disparities between racial groups. Third, a more equitable promotion scheme does not substantially negatively affect core business outcomes for the platform. This study contributes to the information systems literature related to using machine learning to reduce racial disparities and to research examining the fairness-economic trade-off. Furthermore, this paper provides a practical path forward for digital platforms that want to increase participation from diverse groups, and for potential crowdfunding participants.


Under review

Not time but place: Location vs. previous choices for prosocial crowdfunding recommendation strategies

Lauren Rhue, Atiya Avery, Jessica Clark

Abstract

Donors on prosocial crowdfunding platforms have two critical motivations: supporting local causes and supporting social connections. Platform recommendation strategies often leverage prior donor choices; however, donors’ choices may be driven by supply constraints rather than their true preferences. In these instances, a recommendation strategy based on donor attributes such as location may better reflect their true preferences. To understand the effectiveness of these two recommendation strategies, we conducted a randomized experiment with 200,000 donors in partnership with a prosocial crowdfunding platform. Donors were randomly selected to receive project recommendations that were either geographically close to their home or geographically close to their previously supported cause. We found that the local recommendation strategy increased the likelihood of clicks and donations. These results are driven by donors without social connections to the platform, indicating that social motivations supersede geographic motivations and suggesting that digital platforms should consider a hierarchical approach. We also found evidence that the local recommendation strategy yields a rich-get-richer effect. We discuss the implications of our findings for digital platforms as well as the practical implications for our research context of education in the United States.