Sustainability is a key driver of consumption choices (White et al. 2019). Consumers increasingly look for eco-friendly products and brands (Harvard Business Review 2023; Insight First 2022). At the same time, 82% have been convinced to buy a product or service through a video (Wyzowl 2024), and consumers appreciate video as a way to learn more about eco-friendly products or services and to stay up to date on the social and environmental issues that interest them (Think with Google 2019). Consequently, an increasing number of companies are using social media posts to communicate sustainability (e.g., Fernandez, Hartmann and Apaolaza 2022; Kronrod et al. 2023; Verk et al. 2021), combining visual, textual and auditory cues. For instance, Renault posts short videos to communicate the features of its new sustainable concept car, Carrefour posts educational videos on how to save energy during daily activities, and Patagonia combines on-screen text, images and narrating voices to promote instances of activism. Consistent with this relevance, a growing body of work on sustainability communication has begun to examine how visual cues – such as certification logos (Pancer et al. 2017) or the aesthetic features of stimuli (Wang et al. 2017) – or verbal cues – such as the assertiveness of the language (Kronrod et al. 2012) and the message framing (Olsen et al. 2014) – affect consumer behavior. However, while it is clear that language and images can each affect consumer behavior, less is known about their interplay and about the effect of multimodality in social media communication (Packard and Berger 2024; Grewal et al. 2022), particularly in the case of dynamic audiovisual content such as videos (Stuppy et al. 2023).