From the study of sign languages we know that the visual modality robustly supports the encoding of conventionalized linguistic elements. The same possibility exists for the visual bodily behavior of speakers of spoken languages, yet such practices are often referred to as ‘gestural’ and are not usually described in linguistic terms. This article describes a practice among speakers of the Brazilian indigenous language Nheengatú of pointing to positions along the east-west axis of the sun’s arc for time-of-day reference, and illustrates how it satisfies any of the common criteria for linguistic elements, as a system of standardized and productive form-meaning pairings whose contributions to propositional meaning remain stable across contexts. First, examples from a video corpus of natural speech demonstrate these conventionalized properties of Nheengatú time reference across multiple speakers. Second, a series of video-based elicitation stimuli tests several dimensions of its conventionalization with nine participants. The results illustrate why modality is not an a priori reason that linguistic properties cannot develop in the visual practices that accompany spoken language. The conclusion discusses different possible morphosyntactic and pragmatic analyses for such conventionalized visual elements and asks whether they might be more crosslinguistically common than we presently know.